Abstract: The growing popularity of location-based systems,
allowing unknown/untrusted servers to easily collect and
process huge amounts of users' information regarding their 
location, has recently started raising serious concerns about 
the privacy of this kind of sensitive information. In this paper 
we study geo-indistinguishability, a formal notion of privacy 
for location-based systems that protects the exact location of a 
user, while still allowing approximate information - typically 
needed to obtain a certain desired service - to be released. 

Our privacy definition formalizes the intuitive notion of 
protecting the user's location within a radius r with a level 
of privacy that depends on r. We present three equivalent 
characterizations of this notion, one of which corresponds to 
a generalized version of the well-known concept of differential 
privacy. Furthermore, we present a perturbation technique for 
achieving geo-indistinguishability by adding controlled random 
noise to the user's location, drawn from a planar Laplace 
distribution. We demonstrate the applicability of our technique 
through two case studies: First, we show how to enhance 
applications for location-based services with privacy guarantees 
by implementing our technique on the client side of the 
application. Second, we show how to apply our technique to
sanitize sensitive location-based information collected by the US
Census Bureau.

I. Introduction 

In recent years, the increasing availability of location 
information about individuals has led to a growing use of 
systems that record and process location data, generally 
referred to as "location-based systems". Such systems in- 
clude (a) Location Based Services (LBSs), in which a user 
obtains, typically in real-time, a service related to his current 
location, as well as (b) location-data mining algorithms, used 
to determine, among others, points of interest and traffic 
patterns. 

The use of LBSs, in particular, has increased significantly
with the growing popularity of mobile devices
equipped with GPS chips, in combination with the increasing
availability of wireless data connections. A recent study
in the US shows that 46% of the adult population of the 
country owns a smartphone and, furthermore, that 74% of
those owners use LBSs [1]. Examples of LBSs include
mapping applications (e.g., Google Maps, Bing Maps), Points
of Interest (POI) retrieval (e.g., AroundMe, Localscope),
coupon/discount providers (e.g., GroupOn, Yowza), GPS navigation
(e.g., TomTom), and location-aware social networks
(e.g., Foursquare, OkCupid).

This work has been partially supported by the European Union Seventh
Framework Programme under the grant agreement no. 295261 (MEALS),
by the ANR-11-IS02-0002 project LOCALI, and by the INRIA Large Scale
Initiative CAPPRIS. The work of Nicolas E. Bordenabe has been partially
funded by the French Defense procurement Agency (DGA) by means of a
PhD grant.

While location-based systems have been shown to provide
enormous benefits to individuals and society, the growing
exposure of users' location information raises important
privacy issues that are, unfortunately, often overlooked.
First, location information itself is commonly
considered by individuals as sensitive. More importantly,
location data can be easily linked to a variety of other
information that an individual might wish to protect; by
collecting and processing accurate location information on a
regular basis, it is possible to infer an individual's home or
work location, sexual preferences, political views, religious
inclinations, etc. In its extreme form, monitoring and control
of an individual's location has even been described as a form
of slavery [2].

As a consequence, several notions of privacy for location-based
systems have been proposed in the literature, many of
them being variations of the k-anonymity concept, together
with techniques to achieve these privacy guarantees. In
Section II we give an overview of such existing notions of
location privacy, and we discuss some of their shortcomings
in relation to our motivating example of a real-time LBS.
Aiming at addressing these shortcomings, we present a novel 
formal privacy definition for location-based systems, as well 
as a technique that allows users to disclose their location 
while satisfying the aforementioned privacy guarantee. 

As a motivating example, we consider a user located in 
Paris who wishes to query an LBS provider for nearby 
restaurants in a private way, i.e. by disclosing some ap- 
proximate information z instead of his exact location x. 
Note that, in contrast to various works in the literature, we 
assume that the user is interested in hiding his location, not 
his identity. In fact, the user might be willing to disclose 
his identity to the provider, in order to obtain personalized 
recommendations, or to participate in a social network. A 
crucial question then is, what kind of privacy does the user 
expect to have in this scenario? On the one hand, he does 
not expect to reveal his exact location but, on the other hand,
he wishes to obtain a service tailored to it. Thus, the user's
requirement is that, by obtaining z, the provider should be
able to infer x approximately but not accurately.

Figure 1. Geo-indistinguishability: different levels of privacy for each r

To capture this requirement, we use the notion of privacy 
within a radius. We fix a circle of radius r centered at the 
user's location x, and reason about the user's level of privacy 
within this radius. Roughly speaking, we say that the user 
enjoys $\ell$-privacy within r if, by observing z, the provider's
ability to infer x among all points within the radius r does
not increase (compared to the case when z is unknown) by
more than a factor depending on $\ell$. The idea is that $\ell$ is
the (inverse of the) user's level of privacy for that radius: the
smaller $\ell$ is, the higher privacy the user enjoys (as it gets
harder for the provider to detect the user's location among 
the locations within this circle). 

Then, in order to allow the provider to learn x approximately
but not accurately, we require a different level of
privacy $\ell$ for each radius; in particular we require that the
level of privacy decreases with r, i.e. that $\ell$ grows
proportionally with r, which brings us to our (still
informal) definition of geo-indistinguishability:

A mechanism satisfies $\epsilon$-geo-indistinguishability
iff for any radius r > 0, the user enjoys $\epsilon r$-privacy
within r.

This definition requires that the user is protected within any
radius r, but with a level $\ell(r) = \epsilon r$ that increases with
the distance. Within a short radius, for instance r = 1 km,
$\ell(r)$ is small, guaranteeing that the provider cannot infer the
user's location within, say, the 7th arrondissement of Paris.
On the other hand, privacy decreases for locations farther
away from the user; taking for instance r = 10,000 km, $\ell(r)$
becomes very large, allowing the LBS provider to infer that
with high probability the user is located in Paris instead of,
say, London. The idea of different privacy levels for each
radius is illustrated in Figure 1.



Geo-indistinguishability is formalized in Section III 
where we present three equivalent characterizations offering 
alternative interpretations of our privacy guarantee. One of 
the characterizations corresponds to a generalized version of 
differential privacy, a well-known notion of privacy from the 
area of statistical databases [3]. This relationship emphasizes
the fact that - like differential privacy - our notion abstracts 



from the side information of the user, such as any prior 
probabilistic knowledge about the user's actual location. 

Furthermore, we develop a mechanism to achieve
geo-indistinguishability by reporting a randomly selected point
z, obtained by perturbing the user's location x. The
inspiration for our mechanism comes from one of the most
popular approaches for differential privacy, namely drawing 
noise from a Laplace distribution. This distribution, how- 
ever, is one-dimensional, while a planar (two-dimensional) 
mechanism is required to generate noise for location values. 
Nevertheless, the Laplace distribution can be extended to 
the continuous plane obtaining, in this way, a distribution 
to draw noise for location values in a geo-indistinguishable 
fashion. Moreover, via a transformation to polar coordinates, 
we are also able to devise a simple and efficient method 
to draw points. However, as standard (digital) applications
require a finite representation of locations, it is necessary
to add a discretization step after randomly generating z;
this operation degrades the level of privacy provided by
the mechanism. Quantifying this degradation of privacy
poses several technical challenges; we show how to
overcome them and how to adjust the privacy parameters of
our mechanism in order to obtain a desired level of privacy.
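
The polar-coordinate idea can be illustrated with a minimal Python sketch. It assumes that the planar Laplace density centered at x is proportional to exp(-ε·d(x, z)); under that assumption the angle of the noise vector is uniform and its length has density ε²·r·exp(-εr), i.e. a Gamma distribution with shape 2 and scale 1/ε. The function name, the flat Cartesian coordinates and the absence of any discretization step are assumptions made for the illustration only.

```python
import math
import random


def planar_laplace_sample(x, y, epsilon, rng=random):
    """Return (x, y) perturbed by planar Laplace noise with parameter epsilon.

    Assumes flat Cartesian coordinates; epsilon is in inverse distance units.
    No discretization is applied (assumption of this sketch).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The angle of the noise vector is uniform on [0, 2*pi).
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # The radius has density eps^2 * r * exp(-eps * r),
    # i.e. a Gamma distribution with shape 2 and scale 1/eps.
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + radius * math.cos(theta), y + radius * math.sin(theta)
```

Under these assumptions the expected distance between the real and the reported point is 2/ε, so a smaller ε (more privacy) yields proportionally noisier reports.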

We conclude our work by demonstrating the applicability 
of our approach through two case studies, one based on 
LBSs and the other on location-data mining. In the former 
case, we show that, by trading privacy for bandwidth usage, 
geo-indistinguishability can be obtained without degrading 
the utility of the information provided by the LBS. In the 
latter case, we show how to apply our technique to sanitize 
datasets containing geographical information. In particular, 
we show how to sanitize publicly available geographic infor- 
mation released by the US Census Bureau. Our experiments 
reveal that providing geo-indistinguishability to all users 
in the dataset (i.e., US inhabitants) does not significantly 
decrease the quality of the sanitized data (the degree of 
decrease being inversely proportional to the parameters $\ell$
and r of the privacy guarantee). 

Road Map: In Section II we discuss notions of location
privacy from the literature and point out their weaknesses
and strengths. In Section III we formalize the notion
of geo-indistinguishability in three equivalent ways. We
then proceed to describe a mechanism that provides
geo-indistinguishability in Section IV. In Sections V and VI we
demonstrate the applicability of our approach by case studies
related to LBSs and Location-Data Mining, respectively.
Section VII discusses related work; Section VIII concludes. All
proofs are in the appendix.

II. Existing Notions of Location Privacy 

In this section, we examine various notions of privacy 
from the literature, as well as techniques to achieve them. We 
consider the motivating example from the introduction, of a 
user in Paris wishing to find nearby restaurants with good 



reviews. To achieve this goal, he uses a handheld device (e.g.,
a smartphone) to query a public LBS provider. However,
the user expects his location to be kept private: informally
speaking, the information sent to the provider should not
allow the provider to accurately infer the user's location. Our goal
is to provide a formal notion of privacy that adequately 
captures the user's expected privacy. From the point of 
view of the employed mechanism for achieving privacy, we 
require a technique that can be performed in real-time by a 
handheld device such as a smartphone, without the need of 
any trusted anonymization party. 

A. k-anonymity

The notion of k-anonymity is the most widely used
definition of privacy for location-based systems in the literature.
Many systems in this category ([4], [5], [6]) aim at protecting
the user's identity, requiring that the attacker cannot infer 
which user is executing the query, among a set of k different 
users. Such systems are outside the scope of our problem, 
since we are interested in protecting the user's location. 

On the other hand, k-anonymity has also been used to
protect the user's location (sometimes called l-diversity in
this context), requiring that it is indistinguishable among
a set of k points (often required to share some semantic
property). One way to achieve this is through the use of
dummy locations ([7], [8]). This technique involves generating
k - 1 properly selected dummy points, and performing
k queries to the service provider, using the real and dummy
locations. Another method for achieving k-anonymity is
through cloaking ([9], [10], [11]). This involves creating
a cloaking region that includes k points sharing some
property of interest, and then querying the service provider
for this cloaking region.

The main drawback of k-anonymity-based approaches in
general is that a system cannot be proved to satisfy this
notion unless assumptions are made about the attacker's
side information. For example, dummy locations are only
useful if they look equally likely to be the real location
from the point of view of the attacker. Any side information
that allows the attacker to rule out any of those points, as having low
probability of being the real location, would immediately
violate the definition.

Counter-measures are often employed to avoid this issue:
for instance, [7] takes into account concepts such as ubiquity,
congestion and uniformity for generating dummy points,
in an effort to make them look realistic. Similarly, [11]
takes into account the user's side information to construct
a cloaking region. Such counter-measures have their own
drawbacks: first, they complicate the employed techniques, 
also requiring additional data to be taken into account, 
making their application in real-time by a handheld device 
challenging. Moreover, the attacker's actual side information 
might simply be inconsistent with the assumptions being 
made. 



As a result, notions that abstract from the attacker's side 
information, such as differential privacy, have been growing 
in popularity in recent years, compared to k-anonymity-based
approaches.

B. Differential Privacy 

Differential Privacy ([3]) is a notion of privacy from the
area of statistical databases. Its goal is to protect an
individual's data while publishing aggregate information about
the database. Differential privacy requires that modifying a
single user's data should have a negligible effect on the query
outcome. More precisely, it requires that the probability that
a query returns a value v when applied to a database D,
compared to the probability to report the same value when
applied to an adjacent database D' - meaning that D, D'
differ in the value of a single individual - should be within a
bound of $e^{\epsilon}$. A typical way to achieve this notion is to add
controlled random noise to the query output, for example 
drawn from a Laplace distribution. An advantage of this 
notion is that a mechanism can be shown to be differentially 
private independently from any side information that the 
attacker might possess. 
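
As an illustration of the noise-adding approach, the following Python sketch answers a counting query with Laplace noise. It assumes the standard calibration for a query of sensitivity 1 (changing one individual's record changes a count by at most 1), for which noise of scale 1/ε suffices; the function names and the example data are hypothetical.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=random):
    """Answer a counting query with epsilon-differential privacy.

    A count changes by at most 1 when one individual's record changes
    (sensitivity 1), so Laplace noise of scale 1/epsilon is used.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical database of ages; query: how many individuals are 40 or older?
ages = [23, 35, 41, 52, 67, 29]
noisy_answer = private_count(ages, lambda a: a >= 40, epsilon=0.5)
```

Each call returns the true count (3 here) plus fresh noise, so repeated answers fluctuate around the true value.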

Differential privacy has also been used in the context 
of location privacy. In [12], it is shown that a synthetic
data generation technique can be used to publish statistical
information about commuting patterns, while satisfying
differential privacy. In [13], a quadtree spatial decomposition
technique is used to ensure differential privacy in a database 
with location pattern mining capabilities. 

As shown by the aforementioned works, differential pri- 
vacy can be successfully applied in cases where aggregate 
information about several users is published. On the other 
hand, the nature of this notion makes it poorly suited to
applications in which a single individual is involved, such
as our motivating scenario. The secret in this case is the 
location of a single user. Thus, differential privacy would 
require that any change in that location should have negligi- 
ble effect on the published output, making it impossible to 
communicate any useful information to the service provider. 

C. Transformation-based approaches 

A number of approaches for location privacy are radically 
different from the ones mentioned so far. Instead of cloaking 
the user's location, they aim at making it completely invisi- 
ble to the service provider. This is achieved by transforming 
all data to a different space, usually employing cryptographic 
techniques, so that they can be mapped back to spatial 
information only by the user ([14], [15]). The data stored
in the provider, as well as the location sent by the user, are
encrypted. Then, using techniques from Private Information 
Retrieval, the provider can return information about the 
encrypted location, without ever discovering which actual 
location it corresponds to. 



A drawback of these techniques is that they are compu- 
tationally demanding, making it difficult to implement them 
in a handheld device. Moreover, they require the provider's 
data to be encrypted, making it impossible to use popular 
providers, such as Google Maps, which have access to the 
real data. 

III. Geo-Indistinguishability 

In this section we formalize our notion of
geo-indistinguishability. As already discussed in the introduction, the main
idea behind this notion is that privacy is considered with respect to
a certain radius r, with a level of privacy that decreases
as r grows. More precisely, a mechanism satisfies
$\epsilon$-geo-indistinguishability iff for any radius r > 0, the user enjoys
$\epsilon r$-privacy within r. So far we kept the discussion on an
informal level, without explicitly defining what $\ell$-privacy
within r means. In the remainder of this section we formalize
this notion in three different ways; all of them turn out to be
equivalent, but they are all useful for understanding in depth
the privacy guarantees provided by geo-indistinguishability.

Note that the parameter $\epsilon$ corresponds to the level of
privacy at one unit of distance. For the user, a simple way to
specify his privacy requirements is by a tuple $(\ell, r)$, where r
is the radius he is mostly concerned with and $\ell$ is the privacy
level he wishes for that radius. In this case, it is sufficient
to require $\epsilon$-geo-indistinguishability for $\epsilon = \ell/r$; this will
ensure a level of privacy $\ell$ within r, and a proportionally
selected level for all other radii.
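
The arithmetic behind this choice can be sketched in a few lines of Python. The requirement used below (level ln 2 within 0.5 km) and the helper names are hypothetical and serve only to show how one pair of values fixes the level at every other radius.

```python
import math


def epsilon_for(level, radius):
    """Return epsilon = level / radius (in inverse distance units)."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return level / radius


def level_at(epsilon, radius):
    """Privacy level l = epsilon * r that the user enjoys within `radius`."""
    return epsilon * radius


# Hypothetical requirement: level ln(2) within a radius of 0.5 km.
eps = epsilon_for(math.log(2), 0.5)  # expressed per km

# Multiplicative bounds e^l at two radii: 2 within 0.5 km, 4 within 1 km.
bounds = [math.exp(level_at(eps, r)) for r in (0.5, 1.0)]
```

Doubling the radius doubles the level and therefore squares the multiplicative bound.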

A. Probabilistic model 

We introduce here the simple probabilistic model that is 
used in the rest of the paper. We start with a set $\mathcal{X}$ of points
of interest, typically the user's possible locations. Moreover,
let $\mathcal{Z}$ be a set of possible reported values, which in general
can be arbitrary, although for our needs we consider $\mathcal{Z}$ to
also contain spatial points. In our operational scenario, the
user is assumed to be at the location $x \in \mathcal{X}$. He then selects
a point $z \in \mathcal{Z}$ which is made available to the attacker (for
instance, it is reported to an untrusted service provider).

Probabilities come into play in two ways. First, the
attacker might have side information about the user's location,
knowing, for example, that he is likely to be visiting the
Eiffel Tower, while unlikely to be swimming in the Seine
river. Let X be the random variable giving the user's location
(ranging over $\mathcal{X}$); the attacker's side information can be
modelled by a prior distribution $P_X$ for X, where $P_X(x)$
is the probability assigned to the location x.

Second, the selection of a point in $\mathcal{Z}$ is itself probabilistic;
for instance, z can be obtained by adding random noise to
the actual location x (a technique used in Section IV). The
probabilistic function for selecting a reported value based
on the actual location is called a mechanism. Let Z be the
random variable giving the reported point; a mechanism $K$
for selecting z is a function assigning to each location $x \in \mathcal{X}$
a probability distribution for Z, where $K(x)(S)$ is the
probability that the reported point belongs to the set $S \subseteq \mathcal{Z}$,
when the user's location is x.^1 Together, $P_X$ and $K$ induce
a joint probability distribution P for X, Z, as
$P(x, S) = P_X(x) K(x)(S)$. Note that, by construction, $P(x) = P_X(x)$
and $P(S|x) = K(x)(S)$.

B. First approach

We return to the issue of formalizing what $\ell$-privacy
within a radius r means. An intuitive way of doing so is
to compare the probabilities of different locations within r,
after seeing a reported point in $S \subseteq \mathcal{Z}$ (note that we always
consider sets of reported points, to allow for continuous
distributions). Let $x, x' \in \mathcal{X}$ be such that $d(x, x') \le r$,
where $d(\cdot, \cdot)$ denotes the Euclidean distance between points.
Ideally, we would like to require that $P(x|S)/P(x'|S) \le e^{\ell}$,
meaning that for a small $\ell$ the attacker assigns similar
probabilities to the user being located in x or x' after
observing S.

However, we would like our definition to hold for any
side information that the attacker might have, meaning
for all priors $P_X$. Intuitively, we cannot expect the above
condition to hold for all priors, since two locations x, x' with
very different prior probabilities (e.g., the Eiffel Tower vs a
location in the Seine) will also have different probabilities
after the observation S. In other words, if $P(x)/P(x')$ is
large, we cannot expect the corresponding fraction after
observing S to be small. What we can expect, however, is
that the two fractions, before and after the observation, are
similar, meaning that S has limited effect on the probabilities
assigned by the attacker. This brings us to our first formal
definition of geo-indistinguishability:

Geo-indistinguishability-I: A mechanism satisfies
$\epsilon$-geo-indistinguishability iff for all priors $P_X$ and all $S \subseteq \mathcal{Z}$:^2

$$\frac{P(x|S)}{P(x'|S)} \le e^{\epsilon r} \, \frac{P(x)}{P(x')} \qquad \forall r > 0 \;\; \forall x, x' : d(x, x') \le r$$


C. Second approach 

A second approach for defining privacy within a radius r
is to focus on a single location x and compare the probability
of x before and after the observation. Ideally, we would like
to require that $P(x|S)/P(x) \le e^{\ell}$, meaning that for a small
$\ell$ the probability of x should not be affected by the observation
S. However, this requirement is clearly too strong since
some information is allowed to be leaked: a location in Paris 
might have negligible prior probability, since the user could 
be located anywhere in the world, while after the observation 
its probability is substantially increased. 

^1 For simplicity we assume $\mathcal{X}$ to be discrete, but allow $\mathcal{Z}$ to be continuous
since we use continuous distributions in Section IV. Thus we consider
probabilities of sets of points, implicitly assuming them to be measurable.

^2 Note that for the sake of readability, we express the definitions in
terms of fractions. To avoid issues with zero probabilities, we can write
all definitions in flat form, i.e. $P(x|S) P(x') \le e^{\epsilon r} P(x'|S) P(x)$.



Remember, however, that we are interested in privacy
within the radius r. Let $B_r(x)$ be the set of locations at
distance at most r from x. Since we are interested in the
attacker's capability of locating the user within this radius,
we condition all probabilities on the event $B_r(x)$. In other
words, we reason about how accurately the attacker could
infer a particular location x, if he already knew that the
location was within $B_r(x)$. This brings us to our second
definition of geo-indistinguishability:

Geo-indistinguishability-II: A mechanism satisfies
$\epsilon$-geo-indistinguishability iff for all priors $P_X$ and all $S \subseteq \mathcal{Z}$:

$$\frac{P(x|S, B_r(x))}{P(x|B_r(x))} \le e^{\epsilon r} \qquad \forall r > 0 \;\; \forall x \in \mathcal{X}$$



D. Third approach 

So far, we have considered the probability that the attacker
assigns to locations before and after observing S, since
comparing these probabilities is a natural way to quantify how
much S helps the attacker. We now change our standpoint
and consider the probabilities of observations
instead of locations. Intuitively, if two locations x, x' produce a
reported value in S with similar probabilities, then S reveals
little information about whether the actual location is x or
x'. Thus, it is natural to require that $P(S|x)/P(S|x') \le e^{\ell}$ for
locations that lie within the radius r. This brings us to our
final definition of geo-indistinguishability:

Geo-indistinguishability-III: A mechanism satisfies
$\epsilon$-geo-indistinguishability iff for all $S \subseteq \mathcal{Z}$:^3

$$\frac{K(x)(S)}{K(x')(S)} \le e^{\epsilon r} \qquad \forall r > 0 \;\; \forall x, x' : d(x, x') \le r$$



This definition requires that locations within close dis- 
tance produce observations with similar probabilities. The 
farther away two locations are, the more different we allow 
the probabilities of producing S to be. This is very similar 
to the definition of differential privacy, which requires two 
databases that differ on a single individual to produce the 
same answer with similar probabilities, while databases that 
differ on many individuals are allowed to give an answer 
with different probabilities. 

Note that differential privacy aims at completely pro- 
tecting the value of an individual, requiring that arbitrary 
changes in his value should have negligible effect on the 
reported answer. In our scenario, however, such a require- 
ment would be too strong, since the only information is 
the location of a single individual. Nevertheless, we are 
not interested in completely hiding the user's location, since 
some approximate information needs to be revealed in order 
to obtain the required service. This is achieved using a level 
of privacy that depends on the distance between locations. 

^3 Note that since $P(S|x) = K(x)(S)$, this definition can be given only
in terms of $K$, independently from the prior $P_X$.



Still, the connection between geo-indistinguishability and 
differential privacy is strong. In fact, the above definition is
an instance of a generalized variant of differential privacy
([16], [17], [18]) which takes into account an arbitrary
metric between secrets, where standard differential privacy
corresponds to the so-called Hamming distance. In [16]
the generalized definition is used to perform a compositional
analysis of standard differential privacy for functional
programs, while [17] uses metrics between individuals to
define "fairness" in classification. In a companion paper
[18], we study the generalized definition and show that
different metrics provide different notions of privacy which 
can be suitable in various applications. This paper focuses 
on location-based systems and is, to our knowledge, the first 
work considering privacy under the Euclidean metric, which 
is a natural choice for spatial data. 

Finally, we can show that the three definitions of geo- 
indistinguishability given in this section are simply different 
ways of expressing the same privacy requirement. 

Theorem 3.1: Geo-indistinguishability-I, II, III coincide. 
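
The statement of the theorem can be sanity-checked numerically on a small finite example. The Python sketch below is not a proof and makes several assumptions of its own: a hypothetical 3x3 grid serving as both the set of locations and the set of reported values, a discrete mechanism with weights exp(-(ε/2)·d(x, z)) (an exponential-mechanism-style construction chosen for the illustration, not the planar Laplace mechanism of Section IV), a single randomly generated prior, and singleton observations only (for the third definition, bounds on sets follow by summing over their points).

```python
import itertools
import math
import random

# Hypothetical finite setting: locations and reported values on a 3x3 grid.
POINTS = [(i, j) for i in range(3) for j in range(3)]
EPS = 1.0
TOL = 1e-12


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def mechanism(x):
    """K(x): distribution over POINTS with weights exp(-(EPS/2) * d(x, z))."""
    w = [math.exp(-0.5 * EPS * dist(x, z)) for z in POINTS]
    total = sum(w)
    return [v / total for v in w]


K = {x: mechanism(x) for x in POINTS}


def satisfies_def_3():
    # K(x)(z) <= e^{eps * d(x, x')} * K(x')(z) for every singleton z.
    return all(
        K[x][k] <= math.exp(EPS * dist(x, y)) * K[y][k] + TOL
        for x, y in itertools.product(POINTS, repeat=2)
        for k in range(len(POINTS))
    )


def satisfies_def_1(prior):
    # Flat form: P(x|z) P(x') <= e^{eps * d(x, x')} P(x'|z) P(x).
    for k in range(len(POINTS)):
        pz = sum(prior[x] * K[x][k] for x in POINTS)
        post = {x: prior[x] * K[x][k] / pz for x in POINTS}
        for x, y in itertools.product(POINTS, repeat=2):
            bound = math.exp(EPS * dist(x, y))
            if post[x] * prior[y] > bound * post[y] * prior[x] + TOL:
                return False
    return True


def satisfies_def_2(prior, radii=(1.0, 1.5, 2.0, 3.0)):
    # P(x | z, B_r(x)) <= e^{eps * r} * P(x | B_r(x)).
    for r, x, k in itertools.product(radii, POINTS, range(len(POINTS))):
        ball = [y for y in POINTS if dist(x, y) <= r]
        before = prior[x] / sum(prior[y] for y in ball)
        after = prior[x] * K[x][k] / sum(prior[y] * K[y][k] for y in ball)
        if after > math.exp(EPS * r) * before + TOL:
            return False
    return True


rng = random.Random(0)
weights = [rng.random() + 0.01 for _ in POINTS]
prior = {x: w / sum(weights) for x, w in zip(POINTS, weights)}
results = (satisfies_def_3(), satisfies_def_1(prior), satisfies_def_2(prior))
```

For this mechanism all three checks succeed, which is consistent with the theorem; a single prior and a finite grid of course cannot establish the equivalence in general.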

A note on the unit of measurement: Since the notion
of distance between points is crucial for the definition of
geo-indistinguishability, a natural question is: how is the
definition affected by the unit in which distance is measured?
Changing the unit causes all distances to be scaled; still, such
a change should clearly not affect the privacy guarantees of
a mechanism. The crucial point here is that, if r is a physical
quantity expressed in some unit of measurement, then $\epsilon$ has
to be expressed in the inverse unit, so that $\ell = \epsilon r$ is a
pure number; thus $\epsilon$ needs to be updated when the unit of
measurement changes. For simplicity, in the rest of this paper
we omit the unit since the choice is orthogonal to our goals.
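
A short Python sketch of this unit bookkeeping, with hypothetical values (ε = 2 per km, r = 0.2 km) and a hypothetical helper name:

```python
def convert_epsilon(epsilon, new_units_per_old_unit):
    """Re-express epsilon when one old distance unit equals
    `new_units_per_old_unit` new units (e.g. 1 km = 1000 m)."""
    return epsilon / new_units_per_old_unit


eps_per_km = 2.0
eps_per_m = convert_epsilon(eps_per_km, 1000.0)

level_km = eps_per_km * 0.2   # r = 0.2 km
level_m = eps_per_m * 200.0   # the same radius expressed in metres
```

Both products give the same pure number (0.4 here, up to floating-point rounding), so the guarantee does not depend on the unit chosen.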

E. Protecting multiple locations 

So far, we have assumed that the user has a single location 
that he wishes to communicate to a service provider in 
a private way (typically his current location). In practice, 
however, it is common for a user to have multiple points 
of interest, for instance a set of past locations or a set of 
locations he frequently visits. In this case, the user might 
wish to communicate to the provider some information that 
depends on all points, for instance the set of points itself, 
their centroid, etc. As in the case of a single location, privacy
is still a requirement; the provider is allowed to obtain
only approximate information about the locations, while their exact
values should be kept private. In this section, we discuss
how $\epsilon$-geo-indistinguishability extends to the case where the
secret is a tuple of points $\mathbf{x} = (x_1, \ldots, x_n)$.

Similarly to the case of a single point, the notion of
distance is crucial for our definition. We define the distance
between two tuples of points $\mathbf{x} = (x_1, \ldots, x_n)$,
$\mathbf{x}' = (x'_1, \ldots, x'_n)$ as:

$$d_\infty(\mathbf{x}, \mathbf{x}') = \max_i d(x_i, x'_i)$$

Intuitively, the choice of metric follows the idea of reasoning
within a radius r: when $d_\infty(\mathbf{x}, \mathbf{x}') \le r$, it means that all
$x_i, x'_i$ are within distance r from each other.
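
A direct Python transcription of this distance, assuming points are given as coordinate pairs in a flat plane (the example tuples are hypothetical):

```python
import math


def d_inf(xs, ys):
    """Distance between two equal-length tuples of 2-D points: the largest
    Euclidean distance between corresponding points."""
    if len(xs) != len(ys):
        raise ValueError("tuples must have the same length")
    return max(math.dist(a, b) for a, b in zip(xs, ys))


# Two hypothetical tuples of two points each.
first = ((0.0, 0.0), (3.0, 4.0))
second = ((0.0, 1.0), (3.0, 2.0))
d = d_inf(first, second)  # corresponding points are 1 and 2 apart
```

Here the two pairs of corresponding points are at distance 1 and 2 respectively, so the tuples are at distance 2: they are within radius r exactly when r is at least 2.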

All definitions of this section can then be directly applied
to the case of multiple points, by using $d_\infty$ as the underlying
metric. Enjoying $\ell$-privacy within a radius r means that the
observation can help the attacker to infer $\mathbf{x}$ among all tuples
at distance at most r (i.e. tuples having all points at distance at most r from
the corresponding points of $\mathbf{x}$), by a factor of at most $e^{\ell}$. All
three definitions of geo-indistinguishability remain the same,
the only change being the set of secrets and the distance
between them.

Extending a mechanism to multiple points: A natural
question to ask then is whether we can create a mechanism
for tuples of points, by independently applying an existing
mechanism $K_0$ to each individual point, and reporting a tuple
of values. Starting from a tuple $\mathbf{x} = (x_1, \ldots, x_n)$, we
independently apply $K_0$ to each $x_i$, obtaining a reported
point $z_i$, and then report the tuple $\mathbf{z} = (z_1, \ldots, z_n)$. Thus,
the probability that the combined mechanism $K$ reports $\mathbf{z}$,
starting from $\mathbf{x}$, is the product of the probabilities to obtain
each point $z_i$, starting from the corresponding point $x_i$, i.e.^4

$$K(\mathbf{x})(\mathbf{z}) = \prod_i K_0(x_i)(z_i)$$

The next question is what level of privacy $K$ satisfies.
For simplicity, consider a tuple of only two points $(x_1, x_2)$,
and assume that $K_0$ satisfies $\epsilon$-geo-indistinguishability. At
first look, one might expect the combined mechanism $K$ to
also satisfy $\epsilon$-geo-indistinguishability; however, this is not the
case. The problem is that the two points might be correlated,
thus an observation about $x_1$ will reveal information about
$x_2$ and vice versa. Consider, for instance, the extreme case
in which $x_1 = x_2$. Having two observations about the same
point reduces the level of privacy, thus we cannot expect the
combined mechanism to provide the same level of privacy.
Still, $K$ can be shown to satisfy privacy with a level that
scales linearly with n:

Theorem 3.2: If $K_0$ satisfies $\epsilon$-geo-indistinguishability,
then $K$ satisfies $n\epsilon$-geo-indistinguishability.

Note that this issue is similar to the problem of composing 
queries in standard differential privacy. If the outcome of 
multiple queries is randomized by adding independent noise 
to each answer, then $\epsilon$ scales linearly with the number of
queries. The reason is exactly that the answers are correlated, 
since they come from the same database. 
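
The linear scaling can be observed on a toy example. The Python sketch below assumes a hypothetical one-dimensional grid of four locations and a discrete single-point mechanism with weights exp(-(ε/2)·|x - z|), which satisfies ε-geo-indistinguishability on that grid; neither is taken from the mechanisms discussed later. It applies the mechanism independently to each point and computes, by exhaustive search, the largest exponent c such that the ratio of probabilities for two tuples can reach exp(c·d), where d is the maximum distance between corresponding points.

```python
import itertools
import math

# Hypothetical 1-D setting: locations and reports on the grid {0, 1, 2, 3}.
GRID = [0, 1, 2, 3]
EPS = 0.8


def k0(x):
    """Single-point mechanism with weights exp(-(EPS/2) * |x - z|)."""
    w = [math.exp(-0.5 * EPS * abs(x - z)) for z in GRID]
    total = sum(w)
    return [v / total for v in w]


K0 = {x: k0(x) for x in GRID}


def k(xs, zs):
    """Combined mechanism: product of the per-point probabilities."""
    return math.prod(K0[x][GRID.index(z)] for x, z in zip(xs, zs))


def worst_ratio_exponent(n):
    """Largest ln(K(x)(z) / K(x')(z)) / d_inf(x, x') over all n-tuples."""
    tuples = list(itertools.product(GRID, repeat=n))
    worst = 0.0
    for xs, ys in itertools.product(tuples, repeat=2):
        d = max(abs(a - b) for a, b in zip(xs, ys))
        if d == 0:
            continue
        for zs in tuples:
            worst = max(worst, math.log(k(xs, zs) / k(ys, zs)) / d)
    return worst


worst_1 = worst_ratio_exponent(1)  # at most EPS
worst_2 = worst_ratio_exponent(2)  # at most 2 * EPS
```

In this example the worst-case exponent for pairs is exactly twice that for single points, and it is attained by tuples that repeat the same point, matching the extreme case $x_1 = x_2$ discussed above.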

Due to this scalability issue, the technique of independently
applying a mechanism to each point is only useful
when the number of points is small. Still, this is sufficient for
some applications, such as the case study of Section V. Note
also that this technique is by no means optimal: similarly to
standard differential privacy ([19], [20]), better results could
be achieved by adding noise to the whole tuple $\mathbf{x}$, instead
of each individual point. Developing such techniques for
geo-indistinguishability is left as future work.

^4 For simplicity we consider probabilities of points here; a formal
treatment of continuous mechanisms would require considering sets.

The case of uncorrelated points: In the previous paragraph
we saw that when a mechanism is independently
applied to multiple points, $\epsilon$ increases linearly with the
number of points, because the points can be correlated. On
the other hand, we are sometimes interested in applying
a mechanism to uncorrelated points, that is, points that are
either selected independently from each other, or for which
we can assume that the attacker has no information about
their correlation. This can be captured by requiring that
the probability of selecting $x_i$ is independent of $x_j$ and
vice versa, that is, $P(\mathbf{x}) = \prod_i P(x_i)$ (note that $P(x_i)$ is
still arbitrary). Under this restriction, an observation about
$x_j$ intuitively does not reveal any information about $x_i$.
Assuming that $K_0$ satisfies $\epsilon$-geo-indistinguishability, it can
be shown that the combined mechanism $K$ satisfies the
same level of privacy with respect to the individual point $x_i$, that is,

$$P(\mathbf{z} \mid x_i) \le e^{\epsilon r}\, P(\mathbf{z} \mid x_i') \quad \text{for all } x_i, x_i' \text{ such that } d(x_i, x_i') \le r.$$

Note that $\epsilon$-geo-indistinguishability might not be satisfied
for the tuple $\mathbf{x}$ (we need to take $n\epsilon$ for this purpose). Still,
assuming the lack of correlation, $\epsilon$-geo-indistinguishability will
be satisfied for each individual point $x_i$.
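The per-point guarantee can be made plausible with a short calculation, sketched here for a discrete domain and under the independence assumption $P(\mathbf{x}) = \prod_j P(x_j)$. The notation $K_0(x_j)(z_j)$, for the probability that $K_0$ reports $z_j$ when the actual point is $x_j$, is our shorthand, and we assume the common factor below is non-zero.

```latex
\begin{align*}
P(\mathbf{z} \mid x_i)
  &= K_0(x_i)(z_i) \prod_{j \neq i} \sum_{x_j} P(x_j)\, K_0(x_j)(z_j), \\
\frac{P(\mathbf{z} \mid x_i)}{P(\mathbf{z} \mid x_i')}
  &= \frac{K_0(x_i)(z_i)}{K_0(x_i')(z_i)}
  \;\le\; e^{\epsilon\, d(x_i, x_i')} .
\end{align*}
```

The product over $j \neq i$ does not depend on $x_i$, so it cancels in the ratio and only the guarantee of $K_0$ on the $i$-th point remains.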

E. Comparison with standard differential privacy

As discussed in Section III-D, geo-indistinguishability is
an instance of a generalized version of differential privacy,
using the Euclidean metric to measure the distance between
secrets. Thus, it is natural to examine how this notion compares
to standard differential privacy. As discussed
in Section II, an advantage of geo-indistinguishability is that
it can be applied to scenarios involving a single user, for
which differential privacy is poorly suited. The comparison
becomes more interesting in the case where secrets are tuples
of $n$ points, each corresponding to a different user. Note that
we try to keep the discussion at a high level, focusing mainly
on the privacy guarantees of each notion, and abstracting
from the exact application.

Consider two mechanisms, $K_1$ satisfying $\epsilon_1$-geo-indistinguishability
and $K_2$ satisfying $\epsilon_2$-differential privacy.
Note that simply comparing $\epsilon_1$ and $\epsilon_2$ is meaningless, since
they refer to different definitions. To make a fair comparison,
let $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{x}' = (x_1', x_2, \ldots, x_n)$ be two
tuples differing only in the location of the first user (i.e.,
seen as databases, they are adjacent). We then consider the
level of privacy that each mechanism provides for those
tuples, which corresponds to how well the secret of the
first user is protected. The privacy levels $\ell_1, \ell_2$ of $K_1, K_2$
respectively, for those tuples, are:

$$\ell_1 = \epsilon_1\, d_\infty(\mathbf{x}, \mathbf{x}') = \epsilon_1\, d(x_1, x_1'), \qquad \ell_2 = \epsilon_2$$

in the sense that, for both mechanisms, the ratio
$K_i(\mathbf{x})(S) / K_i(\mathbf{x}')(S)$ is bounded by $e^{\ell_i}$ for all observations
$S$. Thus, comparing the two mechanisms boils down to
comparing $\ell_1, \ell_2$ for various points $x_1, x_1'$.

An important observation is that $\ell_2$ is independent of
the actual points $x_1, x_1'$. This means that standard differential
privacy protects all values in the same way; any secret value
of a user is equally indistinguishable from any other. This
is not the case for $\ell_1$, however, which depends on the actual
points $x_1, x_1'$, and more precisely on their distance. So, the
level of protection depends on the secrets: the closer two
points are, the harder it is for the attacker to distinguish them.

Thus, for points far away from each other, $\ell_1$ will be
greater than $\ell_2$, so differential privacy offers better protection,
while geo-indistinguishability becomes better for points
close to each other, for which $\ell_1$ is smaller than $\ell_2$. This
behaviour becomes more important in cases where $\epsilon_2$ is
"weak", which is often unavoidable in order to provide
acceptable utility (see, for instance, Section VI). Intuitively,
when $\epsilon_2$ is large, offering the same protection $\ell_2 = \epsilon_2$
for all points becomes a drawback. A privacy level that
depends on the distance ensures that nearby points (which,
in the case of location-based systems, need to be highly
indistinguishable) will be adequately protected.
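The crossover between the two notions is simple arithmetic on $\ell_1 = \epsilon_1 d(x_1, x_1')$ and $\ell_2 = \epsilon_2$. The following sketch uses hypothetical parameter values chosen only for illustration; the function names are ours.

```python
def geo_ind_level(eps1, distance):
    """Privacy level l1 = eps1 * d(x1, x1') under geo-indistinguishability."""
    return eps1 * distance

def dp_level(eps2):
    """Privacy level l2 = eps2 under differential privacy (distance-independent)."""
    return eps2

def break_even_distance(eps1, eps2):
    """Distance at which both notions give the same level (l1 == l2)."""
    return eps2 / eps1

eps1, eps2 = 2.0, 1.0            # hypothetical: eps1 per km, eps2 dimensionless
d_star = break_even_distance(eps1, eps2)
for d in (0.1, d_star, 5.0):
    # Below d_star geo-indistinguishability gives the smaller (better) level.
    print(d, geo_ind_level(eps1, d), dp_level(eps2))
```

Points closer than `d_star` are better protected by the geo-indistinguishable mechanism, points farther apart by the differentially private one.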

Finally, when comparing notions of privacy, one needs 
to also examine the loss of utility caused by the added 
noise. This highly depends on the application: differential 
privacy is suitable for publishing aggregate queries with low 
sensitivity, meaning that changes in a single individual have 
a relatively small effect on the outcome. On the other hand, 
location information often has high sensitivity. A trivial 
example is the case where we want to publish the complete 
tuple of points. But sensitivity can be high even for aggregate 
information: consider the case of publishing the centroid of 5 
users located anywhere in the world. Modifying a single user 
can hugely affect their centroid, thus achieving differential 
privacy would require so much noise that the result would 
be useless. For geo-indistinguishability, on the other hand, 
one needs to consider the distance between points when 
computing the sensitivity. In the case of the centroid, a small 
(in terms of distance) change in the tuple has a small effect 
on the result, thus geo-indistinguishability can be achieved 
with much less noise. 
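The centroid example can be made concrete with a few lines of code. This is a minimal sketch with made-up coordinates (in km): moving one of five users by 1 km shifts the centroid by 1/5 km, whereas moving the same user arbitrarily far, which is what differential privacy must account for, shifts it arbitrarily far.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of planar points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

users = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 5.0)]
base = centroid(users)

# Distance-aware view: move the first user by 1 km.
near = [(1.0, 0.0)] + users[1:]
shift_near = math.dist(base, centroid(near))   # 1 km / 5 users

# Adjacent-database view: the first user may be anywhere, e.g. 10000 km away.
far = [(10000.0, 0.0)] + users[1:]
shift_far = math.dist(base, centroid(far))     # 10000 km / 5 users

print(shift_near, shift_far)
```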

IV. A MECHANISM FOR GEO-INDISTINGUISHABILITY 

In this section we present a method to generate noise in 
a way that satisfies geo-indistinguishability. We model the 
location domain as the Euclidean plane equipped with the 
standard notion of Euclidean distance. This model can be 
considered a good approximation of the Earth surface when 
the area of interest is not "too large". 

Figure 2. Two linear Laplacians (with $\epsilon = \ln 2$) centered in 1 and 2. The
ratio between the two curves is at most $e^{\epsilon} = 2$ everywhere.

For applications with a digital interface the domain of
interest is discrete, since the representation of the coordinates
of the points is necessarily finite. However, it does
not seem easy to devise an efficient mechanism for
geo-indistinguishability that generates noise directly on a discrete
plane (we will come back to this point in Section IV-B). We
therefore consider a different approach:

(a) First, we define a geo-indistinguishable, continuous 
mechanism for the ideal case of the continuous plane. 

(b) Then, we discretize the mechanism by remapping each
point generated according to (a) to the closest point in
the discrete domain.

Furthermore, we may want to consider only a limited area.
For instance, if we are on an island, we may wish to report
only locations on land, not in the sea. Thus we may want
to apply a third step:

(c) If desirable, we may truncate the mechanism, so as to
report only points within the limits of the area of interest.

A. A geo-indistinguishable continuous mechanism 

In this section we explore how to define a geo- 
indistinguishable mechanism on the continuous plane. This 
will constitute the basis of our method. 

The idea is that whenever the actual location is $x_0 \in \mathbb{R}^2$,
we report, instead, a point $x \in \mathbb{R}^2$ generated randomly
according to the noise function. The property that we need
to guarantee is that the probabilities of reporting a point
in a certain (infinitesimal) area around $x$, when the actual
locations are $x_0$ and $x_0'$ respectively, differ at most
by a multiplicative factor $e^{\epsilon\, d(x_0, x_0')}$.

Intuitively, this property is achieved if the noise function
is such that the probability of generating a point in the area
around $x$ decreases exponentially with the distance from
the actual location $x_0$. In a linear space this is exactly the
behavior of the Laplace distribution, with probability density
function (pdf) $\frac{\epsilon}{2}\, e^{-\epsilon |x - \mu|}$ (where $\mu, \epsilon$ are parameters). This
distribution has been used in the literature to add noise to
query results on statistical databases, with $\mu$ set to the
actual answer, and it can be shown to satisfy $\epsilon$-differential
privacy ([21]). Figure 2 illustrates the idea.
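The bound on the ratio of two linear Laplacians is easy to verify numerically. The sketch below uses the parameters quoted in the caption of Figure 2 ($\epsilon = \ln 2$, centers 1 and 2); the sampling grid is an arbitrary choice of ours.

```python
import math

def laplace_pdf(x, mu, eps):
    """Pdf of the linear Laplace distribution: (eps / 2) * exp(-eps * |x - mu|)."""
    return eps / 2 * math.exp(-eps * abs(x - mu))

eps = math.log(2)
xs = [i / 100 for i in range(-500, 801)]     # sample points in [-5, 8]

# The ratio is bounded by exp(eps * |1 - 2|) = 2 in both directions.
max_ratio = max(laplace_pdf(x, 1, eps) / laplace_pdf(x, 2, eps) for x in xs)
min_ratio = min(laplace_pdf(x, 1, eps) / laplace_pdf(x, 2, eps) for x in xs)
print(max_ratio, min_ratio)
```

The maximum is attained for every sample to the left of both centers and the minimum for every sample to the right of both.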

Of course we cannot use the standard Laplace distribution
for our purposes, because it is defined on the line, while
we need a distribution defined on the plane. Furthermore
we need to use the (Euclidean) planar distance $d(x, \mu)$
instead of the linear distance $|x - \mu|$. Intuitively, however,
just replacing $|x - \mu|$ by $d(x, \mu)$ in the Laplace pdf results
in a natural extension of the Laplace distribution from one to
two dimensions. We call this extension the planar Laplacian.

Figure 3. The pdfs of two planar Laplacians, centered in $(-2, -4)$ and in
$(5, 3)$ respectively, with $\epsilon = 1/5$. The distance between the centers is $7\sqrt{2}$,
and the ratio between the curves is at most $e^{7\sqrt{2}/5} \approx 7.24$ everywhere.

The probability density function: Given the parameter
$\epsilon \in \mathbb{R}^+$ and the actual location $x_0 \in \mathbb{R}^2$, the pdf of our
noise mechanism, on any other point $x \in \mathbb{R}^2$, is:

$$D_\epsilon(x_0)(x) = \frac{\epsilon^2}{2\pi}\, e^{-\epsilon\, d(x_0, x)} \qquad (1)$$

where $\epsilon^2 / 2\pi$ is a normalization factor. Using a transformation
to polar coordinates it is possible to show that the integral
of this function over the whole of $\mathbb{R}^2$ gives 1, which means
that it is indeed the pdf of a probability distribution.

We call this function the planar Laplacian centered in $x_0$.
The corresponding distribution is illustrated by Figure 3.
Note that the projection of a planar Laplacian on any vertical
plane passing through the center gives a graph proportional to
that of a linear Laplacian (Figure 2). In Appendix B we show
that the mechanism defined by a planar Laplacian satisfies
$\epsilon$-geo-indistinguishability.
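Both the normalization claim and the ratio quoted in the caption of Figure 3 can be checked numerically. The sketch below assumes the pdf $\frac{\epsilon^2}{2\pi} e^{-\epsilon d(x_0, x)}$ and integrates it in polar coordinates with a plain midpoint rule; the cut-off radius and step count are arbitrary choices of ours.

```python
import math

def planar_laplace_pdf(eps, x0, x):
    """Pdf of the planar Laplacian centred at x0, evaluated at x."""
    return eps ** 2 / (2 * math.pi) * math.exp(-eps * math.dist(x0, x))

def total_mass(eps, r_max=200.0, steps=200_000):
    """Integral of the pdf over the disc of radius r_max (midpoint rule).
    The angular integral contributes 2*pi and r is the polar Jacobian."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += planar_laplace_pdf(eps, (0.0, 0.0), (r, 0.0)) * 2 * math.pi * r * dr
    return total

eps = 1 / 5
c1, c2 = (-2.0, -4.0), (5.0, 3.0)
bound = math.exp(eps * math.dist(c1, c2))    # e^(7 * sqrt(2) / 5)

# A point beyond c1 on the line through both centres attains the bound.
x = (-9.0, -11.0)
ratio = planar_laplace_pdf(eps, c1, x) / planar_laplace_pdf(eps, c2, x)
print(total_mass(eps), bound, ratio)
```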

Drawing a random point: We now illustrate how to
draw a random point from the pdf defined in (1).

First of all, we note that the pdf of the planar Laplacian
depends only on the distance from $x_0$. It will be convenient,
therefore, to transform the reference system into a system of
polar coordinates with origin in $x_0$. Intuitively, in this way
the pdf will depend only on one variable, thus simplifying
the drawing procedure.

So, given the pdf in (1), we consider the transformation
into a system of polar coordinates $(r, \theta)$, where $r$ is the radius
and $\theta$ is the angle. A point $x$ in Cartesian coordinates will be
represented as a point $(r, \theta)$ in the new system, where $r$ is the
distance of $x$ from $x_0$, and $\theta$ is the angle that the line $x x_0$
forms with respect to the horizontal axis of the Cartesian
system. Following the standard transformation method, the
pdf of the polar Laplacian centered in the origin ($x_0$) is:

$$D_\epsilon(r, \theta) = \frac{\epsilon^2}{2\pi}\, r\, e^{-\epsilon r} \qquad (2)$$

Footnote: In the literature there are various proposals for the extension
of the Laplace distribution to higher dimensions. These are called
multivariate Laplacians. In general, multivariate means that it involves
$k > 1$ random variables. The particular cases of $k = 1$ and $k = 2$ are
called univariate and bivariate respectively. Our definition corresponds
to a particular instance of the extension investigated in [22]. The same
instance has also been adopted in [23].

Figure 4. Gamma distribution: (a) pdf and (b) cdf for various values of $\epsilon$.

We note now that the polar Laplacian defined above
enjoys a property that is very convenient for drawing in an
efficient way: the two random variables that represent the
radius and the angle are independent. Namely, the pdf can
be expressed as the product of the two marginals. In fact,
let us denote these two random variables by $R$ (the radius)
and $\Theta$ (the angle). The two marginals are:

$$D_{\epsilon,R}(r) = \int_0^{2\pi} D_\epsilon(r, \theta)\, d\theta = \epsilon^2\, r\, e^{-\epsilon r}$$

$$D_{\epsilon,\Theta}(\theta) = \int_0^{\infty} D_\epsilon(r, \theta)\, dr = \frac{1}{2\pi}$$

Hence we have $D_\epsilon(r, \theta) = D_{\epsilon,R}(r)\, D_{\epsilon,\Theta}(\theta)$.

Note that $D_{\epsilon,R}(r)$ corresponds to the pdf of the gamma
distribution with shape 2 and scale $1/\epsilon$. Figure 4 shows the
graph of this function for various values of $\epsilon$.

It may come as a surprise that this graph differs significantly
from those in Figures 2 and 3, and in particular
that it does not have its maximum in the origin. Remember,
however, that the graph in Figure 4(a) represents a pdf in
polar coordinates. More precisely, $D_{\epsilon,R}(r)$ represents the
probability that the random point is located in the circular
crown centered in the origin and delimited by $r$ and $r + dr$.
The area of this crown is proportional to $r$, hence when $r$ is
close to 0 the probability is also close to 0. As $r$ increases
the probability increases, until the factor $e^{-\epsilon r}$ takes over.
For $r$ approaching infinity, the factor $e^{-\epsilon r}$ approaches 0
and dominates over $r$, hence the probability approaches 0
again.

Thanks to the fact that $R$ and $\Theta$ are independent, in order
to draw a point $(r, \theta)$ from $D_\epsilon(r, \theta)$ it is sufficient to draw
$r$ and $\theta$ separately from $D_{\epsilon,R}(r)$ and $D_{\epsilon,\Theta}(\theta)$ respectively.

Since $D_{\epsilon,\Theta}(\theta)$ is constant, drawing $\theta$ is easy: it is
sufficient to generate $\theta$ as a random number in the interval
$[0, 2\pi)$ with uniform distribution.

We now show how to draw $r$. Following standard lines,
we consider the cumulative distribution function (cdf) $C_\epsilon(r)$:

$$C_\epsilon(r) = \int_0^r D_{\epsilon,R}(\rho)\, d\rho = 1 - (1 + \epsilon r)\, e^{-\epsilon r}$$

Intuitively, $C_\epsilon(r)$ (Figure 4(b)) represents the probability that
the radius of the random point falls between 0 and $r$. Finally,
we generate a random number $z$ with uniform probability in
the interval $[0, 1)$, and we set $r = C_\epsilon^{-1}(z)$. Note that

$$C_\epsilon^{-1}(z) = -\frac{1}{\epsilon} \left( W_{-1}\!\left( \frac{z - 1}{e} \right) + 1 \right)$$

where $e$ is Euler's number and $W_{-1}$ is the Lambert W function
(the $-1$ branch), which can be computed efficiently.

Figure 5. Method to generate Laplacian noise.

Drawing a point $(r, \theta)$ from the polar Laplacian:
1. draw $\theta$ uniformly in $[0, 2\pi)$
2. draw $z$ uniformly in $[0, 1)$ and set $r = C_\epsilon^{-1}(z)$
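The drawing procedure can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: it assumes the closed form $C_\epsilon(r) = 1 - (1 + \epsilon r) e^{-\epsilon r}$ of the gamma cdf with shape 2, and it inverts the cdf by bisection instead of evaluating the Lambert W function (a library routine for the $-1$ branch, such as SciPy's `scipy.special.lambertw` with `k=-1`, could be used instead). All function names are ours.

```python
import math
import random

def radius_cdf(eps, r):
    """C_eps(r) = 1 - (1 + eps * r) * exp(-eps * r)."""
    return 1.0 - (1.0 + eps * r) * math.exp(-eps * r)

def inverse_radius_cdf(eps, z, tol=1e-12):
    """Solve C_eps(r) = z for r by bisection (C_eps is strictly increasing)."""
    lo, hi = 0.0, 1.0 / eps
    while radius_cdf(eps, hi) < z:       # grow the bracket until it contains r
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if radius_cdf(eps, mid) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def draw_polar_laplacian(eps, rng=random):
    """Draw (r, theta): theta uniform in [0, 2*pi), r by inverse-cdf sampling."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    z = rng.random()                     # uniform in [0, 1)
    return inverse_radius_cdf(eps, z), theta

def report(eps, x0, rng=random):
    """Noisy version of the location x0 = (s, t) on the continuous plane."""
    r, theta = draw_polar_laplacian(eps, rng)
    return (x0[0] + r * math.cos(theta), x0[1] + r * math.sin(theta))

print(report(2.0, (48.85, 2.33), random.Random(42)))
```

A quick consistency check with the closed form: for the radius $r$ returned by the bisection, $w = -(1 + \epsilon r)$ should satisfy $w e^{w} = (z - 1)/e$, which is the defining equation of the Lambert W function.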

Given a "universal" Cartesian reference system and the
actual location $x_0 = (s, t)$ in this system, if we could work
in the "ideal" continuous plane, then we would just need to
generate the noise $(r, \theta)$ as specified above, and then report
the point $x = (s + r \cos\theta,\ t + r \sin\theta)$. In practice, however,
there is always some discretization involved, because (a)
computers have finite precision, and (b) (more importantly)
the coordinates of the "universal reference system" will have
a finite representation, typically using only a few decimal
digits. The discretization of our method, and its properties,
constitute the subject of the next section.

B. Discretization 

In practical applications locations are typically represented
by means of discrete coordinates, for instance latitude
and longitude up to some decimal of precision. Thus we
study here how to define an approximation of the Laplace
distribution on a grid $\mathcal{G}$ of discrete Cartesian coordinates.
Again, the property that we need to preserve is that the
probability of generating a point $x$ in the grid decreases
exponentially with the distance from the actual location $x_0$.

Before we start illustrating our method, we wish to explain
why we did not adopt the following approach, which seems
the most natural: In the univariate case, the discrete approximation
of the Laplace distribution is the double geometric
probability distribution $\lambda\, e^{-\epsilon |x - x_0|}$, where $x \in \mathbb{N}$ and $\lambda$
is a normalization factor. This probability function can be
visualized as a symmetric series of "steps" exponentially
decreasing with the (discrete) distance from $x_0$. The obvious
extension to the bivariate (discrete) case would then be the
probability distribution $K(x_0)(x) = \lambda'\, e^{-\epsilon\, d(x_0, x)}$, where $\lambda'$
is a suitable normalization factor.

Unfortunately, there does not seem to be an efficient
way to draw points according to the above distribution. For
this reason we propose a different approach, which can be
summarized as follows. Given the actual location $x_0$, we
report the point $x$ in $\mathcal{G}$ obtained in the following way:

(a) first, we draw a point $(r, \theta)$ from the polar Laplacian
centered in $x_0$ (see (2)), as described in Figure 5;

(b) then, we remap $(r, \theta)$ to the closest point $x$ on $\mathcal{G}$.

We will denote by $K_\epsilon : \mathbb{R}^2 \to \mathcal{P}(\mathcal{G})$ the above mechanism.
In summary, $K_\epsilon(x_0)(x)$ represents the probability of
reporting the point $x$ when the actual point is $x_0$.
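Step (b) amounts to rounding each coordinate to the nearest multiple of the grid step. The sketch below assumes a rectangular grid that passes through the origin, with possibly different steps along the two axes; the function and parameter names are ours.

```python
def snap_to_grid(point, step_x, step_y=None):
    """Remap a point of the plane to the closest point of a rectangular grid
    through the origin, with step_x along x and step_y along y
    (step_y defaults to step_x)."""
    step_y = step_x if step_y is None else step_y
    return (round(point[0] / step_x) * step_x,
            round(point[1] / step_y) * step_y)

# Example: a noisy point snapped to a grid with 0.5-unit steps.
print(snap_to_grid((0.26, -0.74), 0.5))
```

Note that Python's `round` sends exact ties to the nearest even integer, so a point lying exactly halfway between two grid points is assigned deterministically to one of them.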




Figure 6. Remapping the points in polar coordinates to points in the grid.

It is not obvious that the discretization preserves
geo-indistinguishability, due to the following problem: In principle,
each point $x$ in $\mathcal{G}$ should gather the probability of the
set of points for which $x$ is the closest point in $\mathcal{G}$, namely

$$R(x) = \{\, y \in \mathbb{R}^2 \mid \forall x' \in \mathcal{G}.\ d(y, x) \le d(y, x') \,\}$$

However, due to the finite precision of the machine, the
noise generated according to (a) is already discretized in
accordance with the polar system. Let $\mathcal{W}$ denote the discrete
set of points actually generated in (a). Each of those points
$(r, \theta)$ is drawn with the probability of the area between
$r$ and $r + \delta_r$, and between $\theta$ and $\theta + \delta_\theta$, where $\delta_r$ and $\delta_\theta$ denote the
precision of the machine in representing the radius and the
angle respectively. Hence, step (b) generates a point $x$ in $\mathcal{G}$
with the probability of the set $R_\mathcal{W}(x) = R(x) \cap \mathcal{W}$. This
introduces some irregularity in the mechanism, because the
region associated to $R_\mathcal{W}(x)$ has a different shape and
area depending on the position of $x$ relative to $x_0$.

Figure 6 illustrates the situation. The Cartesian grid constituted
by blue horizontal and vertical lines represents $\mathcal{G}$.
The polar grid constituted by black circles and radial lines
represents $\mathcal{W}$. The two dashed rectangles around the points
$x_0$ and $x_1$ represent $R(x_0)$ and $R(x_1)$. The regions $R_0$ and
$R_1$, colored in grey and magenta, correspond to $R_\mathcal{W}(x_0)$ and
$R_\mathcal{W}(x_1)$ respectively. Note that $R_0$ and $R_1$ have different
shapes and areas; for instance, $R_0$ is larger than $R_1$.

In the next paragraph we show that, despite the above
problem, we can still obtain $\epsilon$-geo-indistinguishability, at the
price of replacing the $\epsilon$ of the mechanism by a smaller $\epsilon'$.

Geo-indistinguishability of the discretized mechanism:
We now analyze the privacy guarantees provided by our
discretized mechanism. We show that the discretization
preserves geo-indistinguishability, at the price of introducing
some additional noise. More precisely, we show that $K_{\epsilon'}$
satisfies $\epsilon$-geo-indistinguishability within a range $r_{max}$,
provided that $\epsilon'$ is chosen in a suitable way that depends on $\epsilon$,
on the length of the step units of $\mathcal{G}$, and on the precision of
the machine.

For the sake of generality we do not require the step units
along the two dimensions of $\mathcal{G}$ to be equal. We will call
them grid units, and will denote by $u$ and $v$ the smaller
and the larger unit, respectively. We recall that $\delta_\theta$ and $\delta_r$
denote the precision of the machine in representing $\theta$ and
$r$, respectively. We assume that $\delta_r \le r_{max}\, \delta_\theta$. The following
theorem states the geo-indistinguishability guarantees provided
by our mechanism.

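Theorem 4.1: Assume $r_{max} < u / \delta_\theta$, and let $q = u / (r_{max}\, \delta_\theta)$.
Let $\epsilon, \epsilon' \in \mathbb{R}^+$ be such that

$$\epsilon' + \frac{1}{u} \ln \frac{q + 2 e^{\epsilon' u}}{q - 2 e^{\epsilon' u}} \le \epsilon$$

Then $K_{\epsilon'}$ provides $\epsilon$-geo-indistinguishability within the
range of $r_{max}$. Namely, if $d(x_0, x),\ d(x_0', x) \le r_{max}$ then:

$$K_{\epsilon'}(x_0)(x) \le e^{\epsilon\, d(x_0, x_0')}\, K_{\epsilon'}(x_0')(x)$$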

The difference between $\epsilon'$ and $\epsilon$ in Theorem 4.1 represents
the extra noise that we need to add in order to compensate
for the effect of discretization. Note that $r_{max}$, which
determines the area in which $\epsilon$-geo-indistinguishability is
guaranteed, must be chosen in such a way that $q > 2 e^{\epsilon' u}$.
Furthermore, there is a trade-off between $\epsilon'$ and $r_{max}$. If
we want $\epsilon'$ to be close to $\epsilon$ then we need $q$ to be large.
Depending on the precision, this may or may not imply a
serious limit on $r_{max}$. Vice versa, if we want $r_{max}$ to be
large then, depending on the precision, $\epsilon'$ may need to be
significantly smaller than $\epsilon$, and furthermore we may have
a constraint on the minimum possible value for $\epsilon$, which
means that we may not have the possibility of achieving an
arbitrary level of geo-indistinguishability.

Figure 7 shows the relation between $\epsilon$ and the maximal $\epsilon'$
satisfying the condition of Theorem 4.1. In all cases the grid
unit is $u = 3 \cdot 10^{-3}$ km $= 3$ m, and the other parameters
are as follows:

• The green line corresponds to q = 3 · 10^. For instance
this value can be obtained with double precision (16
significant digits, i.e., $\delta_\theta = 10^{-16}$) and r_max = 10^
km. In the case of double precision, even for much
larger values of $r_{max}$ (up to about 10^ km) $\epsilon'$ coincides
with $\epsilon$.

• The magenta line corresponds to $q = 3 \cdot 10^{2}$. This value
can be obtained with single precision (7 significant
digits, i.e., $\delta_\theta = 10^{-7}$) and $r_{max} = 10^{2}$ km. In this case
we cannot go much higher for $r_{max}$ without $\epsilon'$ diverging
dramatically from $\epsilon$. Furthermore, the smallest possible
value for $\epsilon$ is about 4.5, which means that at most we
can ensure 4.5-geo-indistinguishability.

• The blue line corresponds to $q = 3 \cdot 10^{3}$, which can
still be obtained with single precision at the price of
reducing the previous $r_{max}$ by a factor 10 ($r_{max} = 10$ km).
Alternatively we could obtain this value by increasing
both the precision and $r_{max}$, for instance with an
intermediate precision of 9 significant digits ($\delta_\theta = 10^{-9}$)
and $r_{max} = 10^{3}$ km.
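The relation between $\epsilon$ and the maximal $\epsilon'$ can be explored numerically. The sketch below is an illustration under an explicit assumption: it takes the condition of Theorem 4.1 to read $\epsilon' + \frac{1}{u} \ln \frac{q + 2e^{\epsilon' u}}{q - 2e^{\epsilon' u}} \le \epsilon$ with $q = u / (r_{max} \delta_\theta)$, which is our reading of the theorem. The function names and the example parameters (single precision, $r_{max} = 100$ km) are ours.

```python
import math

def required_eps(eps_prime, u, q):
    """Left-hand side of the assumed condition of Theorem 4.1."""
    t = 2.0 * math.exp(eps_prime * u)
    if q <= t:
        return math.inf          # the condition q > 2 * e^(eps' * u) is violated
    return eps_prime + math.log((q + t) / (q - t)) / u

def max_eps_prime(eps, u, q, iters=200):
    """Largest eps' with required_eps(eps') <= eps, or None if none exists.
    Bisection works because required_eps is increasing in eps'."""
    if required_eps(0.0, u, q) >= eps:
        return None
    lo, hi = 0.0, eps            # required_eps(eps') > eps', so eps' < eps
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if required_eps(mid, u, q) <= eps:
            lo = mid
        else:
            hi = mid
    return lo

u = 3e-3                         # grid unit in km
q = u / (100.0 * 1e-7)           # hypothetical: r_max = 100 km, delta_theta = 1e-7
print(required_eps(0.0, u, q))   # smallest eps that can be guaranteed
print(max_eps_prime(10.0, u, q))
```

Under these assumptions the smallest achievable $\epsilon$ comes out close to the value of about 4.5 quoted for single precision.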

Figure 7. The relation between $\epsilon$ and $\epsilon'$ for various precisions.

Note that in Theorem 4.1 the restriction on $r_{max}$ is
crucial. Namely, $\epsilon$-geo-indistinguishability does not hold for
arbitrary distances for any finite $\epsilon$. Intuitively, this is because
the step units of $\mathcal{W}$ (see Figure 6) become larger with the
distance $r$ from $x_0$. The step units of $\mathcal{G}$, on the other hand,
remain the same. When the steps in $\mathcal{W}$ become larger than
those of $\mathcal{G}$, some $x$'s have an empty $R_\mathcal{W}(x)$. Therefore, when
$x$ is far away from $x_0$ its probability may or may not be 0,
depending on the position of $x_0$ in $\mathcal{G}$, which means that
geo-indistinguishability cannot be satisfied.

On the other hand, the restriction on $r_{max}$ is not a strong
limitation, because the distribution decreases exponentially
with $r$, and $r_{max}$ is usually large, hence the points with
distance $r > r_{max}$ have negligible probability.

C. Truncation 

In practical applications we are typically interested in
locations within a certain region. The Laplacian mechanisms
described in the previous sections, however, have the potential to
generate points everywhere in the plane. If the user knows
that the actual location is situated within a certain region,
it seems desirable that the reported location lies within
the same region as well, or at least not too far from it. To
this purpose we propose a variant of the discrete Laplacian
described in the previous section, which generates points only
within a specified region.

We assume that the specified region $A$ of acceptable report
points is a circle with center $o$ and diameter $diam(A)$.
Our mechanism works like the discretized Laplacian of the
previous section, with the difference that, whenever the point
generated in step (a) lies outside $A$, we remap it to the
closest point in $A \cap \mathcal{G}$ (which will necessarily be on the
perimeter of $A$, modulo discretization).

Let us denote by $K^T_{\epsilon'}$ the truncated variant of the mechanism
$K_{\epsilon'}$ described in the previous section. Its type is
$K^T_{\epsilon'} : A \to \mathcal{P}(A \cap \mathcal{G})$, and the drawing is described by
the following procedure. Given the actual location $x_0 \in A$:

(a) first, draw a point $(r, \theta)$ from the polar Laplacian
centered on $x_0$, as explained in the previous section;

(b') then, remap $(r, \theta)$ to the closest point $x$ on $A \cap \mathcal{G}$.

Intuitively, $K^T_{\epsilon'}$ behaves like $K_{\epsilon'}$ except when the region
$R(x)$ is on the border of $A$. In this case, the probability
on $x$ is given not only by the probability of the points in
$R_\mathcal{W}(x)$, but also by the probability of the part of the cone
determined by $o$ and $R(x)$ which lies outside $A$.
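The truncation step can be sketched on the continuous plane as a radial projection onto the disc. This illustration ignores the grid: the closest point of $A \cap \mathcal{G}$ may differ slightly from the snapped projection, so it should be read as an approximation of step (b'), not as the exact remapping. The function name is ours.

```python
import math

def truncate_to_disc(point, center, radius):
    """Return the point unchanged if it lies in the disc; otherwise return its
    radial projection onto the boundary, the closest point of the disc."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return point
    scale = radius / dist
    return (center[0] + dx * scale, center[1] + dy * scale)

# A point 50 units from the centre is pulled back to the circle of radius 10.
print(truncate_to_disc((30.0, 40.0), (0.0, 0.0), 10.0))
```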

We are now going to show that this new method satisfies
geo-indistinguishability on all of $A$, provided that $r_{max}$ is not
smaller than $diam(A)$.

Theorem 4.2: Let $r_{max}$, $\epsilon$ and $\epsilon'$ satisfy the premise of
Theorem 4.1. If $r_{max} \ge diam(A)$, then $K^T_{\epsilon'}$ provides
$\epsilon$-geo-indistinguishability within $A$.

In the following we generally assume $diam(A) = r_{max}$.

V. ENHANCING LBSs WITH PRIVACY

In this section we present a case study of our privacy 
mechanism in the context of LBSs. In particular we show 
how to enhance LBS applications with privacy guarantees 
while still providing a high quality service to their users. 

A. Geo-indistinguishability for POI retrieval LBSs 

Let us start by describing how geo-indistinguishability can
be used to specify a subtle notion of privacy for LBS applications.
For that purpose, we first delineate the architecture of
LBS applications that we consider in this work. We assume
a simple client-server architecture where users communicate
via a trusted mobile application (the client, typically
installed on a smartphone) with an unknown/untrusted LBS
provider (the server, typically running in the cloud). Hence,
our approach does not rely on trusted third-party servers
(in contrast to several solutions proposed in the literature).
Additionally, since this work focuses on the potential harm
incurred by users when disclosing their location to an LBS, we
assume that users only communicate location information
to the provider (although typically more information, such
as user ID and network address, is transmitted). Figure 8
illustrates the LBS setting that we consider in this work.

Figure 8. LBS architecture: the user sends the current location x to the
LBS server, which returns POI info around x.

For illustration purposes, in this section we will focus
on LBS applications providing POI information. However,
most of the discussion and techniques presented in the
following hold for a broader family of LBS applications
(some of which we mention explicitly below).

Coming back to our running example, we now study 
how geo-indistinguishability can help to provide privacy 
guarantees to the user visiting Paris. More precisely, let us 
assume that the user is sitting at Cafe Les Deux Magots 
and wishes to obtain information about nearby restaurants 
without revealing to a potential attacker (the LBS provider in 
this case) his exact location. However, as discussed before, in 
order to obtain accurate information from the LBS provider, 
the user is willing to reveal some approximate information. 

This privacy guarantee can be captured by our notion
of $\epsilon$-geo-indistinguishability. Letting the user specify his
desired level of privacy, say $\ell = \ln(4)$ within $r = 0.2$ km
(and decreasing proportionally for larger distances),
$(\ln(4)/0.2)$-geo-indistinguishability guarantees the user that by using the
LBS application (and thus revealing his approximate location),
the LBS provider cannot infer his real location (at least
not with probability 4 times higher than without revealing
his location) among all locations within 200 meters.

Figure 9. Our sanitizing algorithm for a location.

Sanitizing Algorithm for a Location - NoisyPt

Input:  x                // point to sanitize
        ε                // privacy parameter
        u, v, δ_θ, δ_r   // precision parameters - Section IV-B
        A                // region of acceptable locations - Section IV-C
Output: Sanitized version x' of input x

1. ε' = safe_ε(ε, v, q);                        // Theorem 4.2
2. Draw angle θ ~ Uniform(0, 2π);               // Figure 5
3. Draw radius r ~ gamma(2, 1/ε');              // Figure 5
4. x' = Pt(x, r, θ);                            // sanitized location
5. if x' ∉ A then x' = closestPt(A, x, r, θ);   // truncation
6. return x';

B. Privately Retrieving POI information from a LBS 

We now proceed to describe how to enhance LBS ap- 
plications with geo-indistinguishability guarantees. In the 
following we distinguish between mildly-location-sensitive 
and highly-location-sensitive LBS applications. 

The former category corresponds to LBS applications offering
a service that does not heavily rely on the precision of
the location information provided by the user. Examples of
such applications are weather forecast applications (forecast
information for an approximate location is typically as good
as forecast information for an exact location), location-aware
advertising/offers (e.g., shops offering discounts typically care
about users being nearby rather than about their exact location),
and a number of LBS applications for POI retrieval (e.g.,
retrieving nearby cheap gas stations or nearby tourist sites
when visiting a city). Enhancing this kind of LBS with
geo-indistinguishability guarantees is relatively straightforward.
It requires implementing the location perturbation
mechanism presented in Section IV on the client side of
the LBS application and then reporting the sanitized location
(instead of the real location) to the LBS server. We note
that this simple modification imposes neither a significant
computational overhead on the client side nor extra bandwidth
usage. Figure 9 delineates a location sanitizing algorithm
based on the techniques described in Section IV.



Figure 10. Retrieval information situation for private LBS

For highly-location-sensitive LBS applications, on the
other hand, the quality of the service provided
highly depends on the precision of the location information
submitted by the user. Our running example lies within this
category. For the user sitting at Cafe Les Deux Magots,
information about restaurants near the Champs Elysees is
considerably less valuable than information about restaurants
around his location. Enhancing this kind of LBS application
with privacy guarantees is considerably more challenging.
In the following we describe how to do so while still
providing a high quality service. Our approach requires three
modifications to the standard LBS architecture:

1) The algorithm illustrated in Figure 9 should be implemented
on the client application in order to report to
the LBS server the user's approximate location z
rather than his real location x.

2) Since the information retrieved from
the server is about POI near z, the area of POI
information retrieval should be increased. In this way, if
the user wishes to obtain information about POI within,
say, 300 meters of x, the client application should
request information about POI within, say, 1 km of
z. Figure 10 illustrates the situation for our running
example. The user's current location x is at Cafe Les
Deux Magots and the reported approximate location
z submitted by the client application is about 500
meters from x. We will refer to the circle centered at x
with 300 meters radius as the area of interest (of the user)
and to the circle centered at z with 1 km radius as the area
of retrieval.

3) Finally, the client application should filter the retrieved
POI information (depicted by the pins within the area
of retrieval in Figure 10) in order to provide the user
with the desired information (depicted by pins within
the user's area of interest in Figure 10).
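The client-side filtering of step 3, together with the containment condition that the retrieval area must satisfy, can be sketched as follows. The data layout (a list of dictionaries with `name` and `pos` fields), the coordinates and the function names are hypothetical; distances are in km on a local planar approximation.

```python
import math

def filter_pois(pois, x, interest_radius):
    """Keep only the POIs within interest_radius of the real location x."""
    return [p for p in pois if math.dist(p["pos"], x) <= interest_radius]

def interest_area_covered(x, z, interest_radius, retrieval_radius):
    """True when the circle around x lies entirely inside the circle around z."""
    return math.dist(x, z) + interest_radius <= retrieval_radius

x = (0.0, 0.0)            # real location, never sent to the server
z = (0.3, 0.4)            # reported location, 0.5 km away from x
retrieved = [             # hypothetical server answer for the 1 km circle around z
    {"name": "bistro", "pos": (0.1, 0.2)},
    {"name": "brasserie", "pos": (0.9, 0.9)},
]
nearby = filter_pois(retrieved, x, 0.3)
print([p["name"] for p in nearby], interest_area_covered(x, z, 0.3, 1.0))
```

With the numbers of the running example (300 m of interest, 1 km of retrieval, z at 500 m from x) the area of interest is fully covered; it would not be if z fell more than 700 m from x.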



The resulting client-server interaction is shown in Figure 11.

Figure 11. LBS architecture: the user sends the approximate location z and
the area of retrieval A to the LBS server, which returns POI info within A.

Figure 12. $(\alpha, \delta)$-usefulness for $r = 0.2$ and various values of $\ell$.

Clearly, for our approach it is crucial that the area of
interest is fully contained in the area of retrieval (otherwise
the information expected by the user might not be fully
retrieved from the server). However, the latter depends on
a randomly generated location, hence such a condition cannot
be guaranteed (at least not with probability 1). Note that
the client application could dynamically adjust the area of
retrieval in order to ensure that it always contains the area
of interest. However, this approach would jeopardize the
privacy guarantees: on the one hand, the size of the area of
retrieval would leak information about the user's real location
and, on the other hand, the LBS provider would know
with certainty that the user is located within the retrieval
area. Therefore, in order to provide geo-indistinguishability
in this setting, the area of retrieval should be defined
independently of the randomly generated location.

Our approach consists in statically defining the area of
retrieval as a function of the security parameters ($\ell$ and
$r$) and of the area of interest. Our goal is to define an
area of retrieval as small as possible (in order to avoid
retrieving unnecessary information and, consequently,
unnecessary bandwidth usage) in such a way that the area of interest
is contained in it with probability as high as possible. Since
this goal highly depends on the accuracy of the mechanism
generating the approximate location (i.e., on how close the
generated location and the real location are to each other),
before presenting our solution we need to introduce the
notion of accuracy for data sanitization mechanisms.

C. Accuracy for location perturbation mechanisms 

As is standard for privacy enhancing mechanisms based 
on data perturbation (e.g., the Laplacian mechanism providing 
standard differential privacy EU), the aim of our mechanism 
is to provide accurate (location) information in a private way 
(i.e., while satisfying geo-indistinguishability). 

In order to evaluate the accuracy of our mechanism, 
we use (α, δ)-usefulness [19], a well-known concept from 
the literature (adapted to our location setting) that aims 
at assessing the accuracy of the approximate information 
generated by a mechanism. 

A location perturbation mechanism K is (α, δ)-useful if 
for every location x, with probability at least 1 − δ, the 
reported location z = K(x) satisfies d(x, z) ≤ α. In other 
words, an (α, δ)-useful mechanism generates approximate 
locations z within distance α of the exact location x with 
probability at least 1 − δ. In the case of our mechanism, δ can 
be computed using the cdf of the Gamma distribution from 
which the radius is drawn. Figure 12 illustrates how our 
mechanism behaves with respect to (α, δ)-usefulness when 
providing ε-geo-indistinguishability for r = 0.2 (as in our 
running example) and several values of ℓ. 



It follows from the information in Figure 12 that a 
mechanism providing the privacy guarantees specified in our 
running example (ε-geo-indistinguishability, with ℓ = ln(4) 
and r = 0.2) generates an approximate location z falling 
within 1 km of the user's location x with probability 0.99, 
falling within 690 meters with probability 0.95, falling 
within 560 meters with probability 0.9, and falling within 
390 meters with probability 0.75. 
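
The probabilities quoted for the running example can be 
checked numerically. The following Python sketch assumes that 
the radius of the planar Laplacian follows a Gamma distribution 
with shape 2 and scale 1/ε, so that its cdf has the closed form 
1 − (1 + εα)e^(−εα); the function name usefulness_probability 
is ours and purely illustrative. 

```python
import math


def usefulness_probability(alpha: float, epsilon: float) -> float:
    """Probability that the reported location lies within distance alpha.

    Assumes the noise radius is Gamma-distributed with shape 2 and
    scale 1/epsilon, whose cdf is 1 - (1 + epsilon*alpha) * exp(-epsilon*alpha).
    """
    if alpha < 0 or epsilon <= 0:
        raise ValueError("alpha must be >= 0 and epsilon must be > 0")
    t = epsilon * alpha
    return 1.0 - (1.0 + t) * math.exp(-t)


# Running example: level ln(4) within r = 0.2 km, hence epsilon = ln(4) / 0.2.
EPSILON = math.log(4) / 0.2

for alpha_km in (1.0, 0.69, 0.56, 0.39):
    p = usefulness_probability(alpha_km, EPSILON)
    print(f"within {alpha_km:.2f} km: probability {p:.3f}")
```

For a given α, the corresponding δ is one minus the returned 
probability. 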

We now have all the necessary ingredients to define an 
area of retrieval containing the area of interest with a given 
probability. Note that an area of retrieval with radius, say, 
r_A contains the area of interest with radius, say, r_I with 
probability at least 1 − δ if the mechanism used to generate 
the reported location is (α, δ)-useful for some α ≤ r_A − r_I. 

Therefore, by setting r_A to 1 km in our running example 
and since our mechanism is (0.69, 0.05)-useful, it is 
guaranteed that the retrieval area contains the area of 
interest with probability at least 0.95. 
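
The choice of retrieval radius can be sketched as follows. The 
code inverts the Gamma cdf of the noise radius by bisection 
(assuming shape 2 and scale 1/ε) and adds the result to the 
radius of the area of interest. The interest radius of 0.3 km 
and all function names are illustrative assumptions, not values 
fixed by this section. 

```python
import math


def radius_cdf(alpha: float, epsilon: float) -> float:
    """cdf of the noise radius, assumed Gamma(shape=2, scale=1/epsilon)."""
    t = epsilon * alpha
    return 1.0 - (1.0 + t) * math.exp(-t)


def accuracy_radius(delta: float, epsilon: float, tol: float = 1e-9) -> float:
    """Smallest alpha such that the mechanism is (alpha, delta)-useful."""
    if not 0.0 < delta < 1.0 or epsilon <= 0:
        raise ValueError("need 0 < delta < 1 and epsilon > 0")
    lo, hi = 0.0, 1.0 / epsilon
    while radius_cdf(hi, epsilon) < 1.0 - delta:  # grow until the target is bracketed
        hi *= 2.0
    while hi - lo > tol:  # bisection on the monotone cdf
        mid = (lo + hi) / 2.0
        if radius_cdf(mid, epsilon) < 1.0 - delta:
            lo = mid
        else:
            hi = mid
    return hi


def retrieval_radius(interest_radius: float, delta: float, epsilon: float) -> float:
    """Retrieval radius covering the area of interest with probability >= 1 - delta."""
    return interest_radius + accuracy_radius(delta, epsilon)


epsilon = math.log(4) / 0.2  # running example, distances in km
r_a = retrieval_radius(interest_radius=0.3, delta=0.05, epsilon=epsilon)  # 0.3 km is hypothetical
print(f"retrieval radius: {r_a:.3f} km")
```

Under these assumptions the computed radius stays just below 
the 1 km used in the running example. 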

D. Further challenges: using a LBS multiple times 

After describing how to provide geo-indistinguishability 
guarantees to users querying a LBS application a single time, 
we now discuss how to extend our solution to the case in 
which users wish to perform multiple queries. 

In this scenario, the mechanism should protect multiple 
locations rather than one. But what does it mean to enjoy 
privacy for multiple locations? As discussed in Section III-E, 
geo-indistinguishability can be naturally extended to this 
scenario. In short, the idea of being ℓ-private within r 
remains the same, but for all locations simultaneously. In this 
way the locations, say, x₁, x₂ of a user employing the LBS 
twice remain indistinguishable from all pairs of locations at 
(point-wise) distance at most r (i.e., from all pairs x′₁, x′₂ 
such that d(x₁, x′₁) ≤ r and d(x₂, x′₂) ≤ r). 

A simple way of obtaining geo-indistinguishability 
guarantees when performing multiple queries is to employ our 
technique for protecting single locations to independently 
generate approximate locations for each of the user's 
locations. In this way, a user performing n queries via a 
mechanism providing ε-geo-indistinguishability enjoys 
nε-geo-indistinguishability (see Theorem 3.2). 
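
To make the degradation concrete, the following Python sketch 
computes the composed privacy level after n independent queries 
and the resulting bound e^(nℓ) on the ratio between the output 
probabilities of two tuples of locations at point-wise distance 
at most r. The function names are illustrative assumptions. 

```python
import math


def composed_level(level: float, n_queries: int) -> float:
    """Privacy level within radius r after n independent queries (n * level)."""
    if n_queries < 0:
        raise ValueError("n_queries must be non-negative")
    return n_queries * level


def distinguishability_bound(level: float, n_queries: int) -> float:
    """Upper bound e^(n * level) on the ratio of output probabilities."""
    return math.exp(composed_level(level, n_queries))


LEVEL = math.log(4)  # level of the running example

for n in (1, 2, 5, 10):
    total = composed_level(LEVEL, n)
    bound = distinguishability_bound(LEVEL, n)
    print(f"n = {n:2d}: level {total:.3f}, ratio bound {bound:.0f}")
```

With ℓ = ln(4), each additional query multiplies the bound by 4. 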

This solution might be satisfactory when the number of 
queries remains fairly low, but in other cases it is 
impractical due to the privacy degradation. It is worth 
noting that the canonical technique for achieving standard 
differential privacy (based on adding noise according to the 
Laplace distribution) suffers from the same privacy degradation 
problem (ε increases linearly with the number of queries). 
Several articles in the literature focus on this problem 
(see [20] for instance). We believe that the principles and 
techniques used to deal with this problem for standard 
differential privacy could be adapted to our scenario (either 
directly or as a source of inspiration). A fruitful direction 
to explore, in our particular scenario, is to employ the 
location history of the user together with the corresponding 
locations reported to the LBS provider (i.e., (x, z) pairs) to 
"adjust" the way approximate locations are generated (e.g., 
report z whenever the user's location x′ is near a location x 
that the mechanism has previously reported as z). This 
challenge constitutes our main focus for future work. 

VI. Sanitizing datasets: US census case study 

In this section we present a sanitation algorithm for 
datasets containing geographical information. Roughly 
speaking, the algorithm iteratively sanitizes each of the 
sensitive geographic values in the dataset by means of the 
perturbation technique presented in Section IV. 

A. The LODES dataset 

We consider a realistic case study involving publicly 
available data developed by the U.S. Census Bureau's 
Longitudinal Employer-Household Dynamics Program (LEHD). 
These data, called LEHD Origin-Destination Employment 
Statistics (LODES), are used by OnTheMap, a web-based 
interactive application developed by the US Census Bureau. 
The application enables, among other features, visualization 
of geographical information involving the residence and 
working location of US residents. 

The LODES dataset includes information of the form 
(hBlock, wBlock), where each pair represents a worker, the 
attribute hBlock is the census block in which the worker 
lives, and wBlock is the census block where the worker 
works. From this dataset it is possible to derive, by mapping 
home and work census blocks into their corresponding 
geographic centroids, a dataset with geographic information of 
the form (hCoord, wCoord), where each of the coordinate 
pairs corresponds to a census block pair. 

Due to privacy constraints and legal issues, data involving 
the residence location of individuals cannot be released 
without previous sanitation; thus, the LODES dataset is a 
sanitized version of the real data. However, for illustration 
purposes and without loss of generality, in the remainder of 
this section we will treat the LODES dataset as if it were the 
real data. The Census Bureau uses a synthetic data generation 
algorithm [24], [12] to sanitize the LODES dataset. Roughly 
speaking, the algorithm interprets the dataset as a histogram 
where each (hBlock, wBlock) pair is represented by a histogram 
bucket; the synthetic data generation algorithm sanitizes data 
by modifying the counts of the histogram. For that purpose, 
a statistical model is built from the LODES dataset and then 
a sanitized counterpart is obtained by sampling points from 
the model. 

In the following we present a sanitizing algorithm for 
datasets with geographical information (e.g., the LODES 
dataset) that provides formal privacy guarantees. In 
particular, our algorithm provides geo-indistinguishability 
guarantees under the assumption that the home census block 
values in the dataset are uncorrelated (see the discussion 
about uncorrelated points in Section III-E). Although this 
assumption weakens the privacy guarantees provided by 
geo-indistinguishability, we believe that, due to the 
anonymizing techniques applied by the Census Bureau to the 
released data involving census participants' information and 
to the large number of (hCoord, wCoord) pairs within small 
areas contained in the dataset, a practical attack based on 
correlation of points is unlikely. 

B. The Sanitizing Algorithm for a dataset of locations 



Our sanitizing algorithm, described in Figure 13, takes as 
input (1) a dataset D to sanitize, (2) the privacy parameters ℓ 
and r (see Section III), and (3) the precision parameters u, v, 
δ_r and δ_θ, and the region A (see Section IV-B), and returns 
a sanitized counterpart of D. The algorithm is guaranteed to 
provide ℓ/r-geo-indistinguishability to the home coordinates 
of all individuals in the dataset (see the discussion on 
protecting multiple locations in Section III-E). 

We note that, in contrast to the approach used by the 
Census Bureau, which perturbs the counts of a histogram, our 
algorithm modifies the geographical data itself (residence 
coordinates in this case). Therefore, our algorithm works 
at a more refined level than the synthetic data generation 
algorithm used by the Census Bureau; a less refined dataset 
can nevertheless be obtained easily by remapping each 
(hCoord, wCoord) pair produced by our algorithm to its 
corresponding census block representation. 

C. Experiments 

In order to evaluate the accuracy of the sanitized dataset 
generated by our algorithm (and thus of our algorithm as a 
data sanitizer) we implemented our perturbation mechanism 
and conducted a series of experiments focusing on the 
"home-work commute distance" analysis provided by the 
OnTheMap application. This analysis provides, for a given 
area (specified as, say, state or county code), a histogram 
classifying the individuals in the dataset residing in the 
given area according to the distance between their residence 
location and their work location. The generated histogram 
contains four buckets representing different ranges of 
distance: (1) from zero to ten miles, (2) from ten to 
twenty-five miles, (3) from twenty-five to fifty miles, and 
(4) more than fifty miles. 
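
The bucketing used by this analysis can be sketched in Python as 
follows. The bucket labels, the function name and the convention 
that a distance lying exactly on a boundary falls in the higher 
bucket are our assumptions; the text does not specify how 
boundary values are treated. 

```python
from bisect import bisect_right

BUCKET_EDGES_MILES = (10.0, 25.0, 50.0)
BUCKET_LABELS = ("0-10", "10-25", "25-50", "50+")


def commute_histogram(distances_miles):
    """Count home-work distances (in miles) in the four commute buckets."""
    counts = {label: 0 for label in BUCKET_LABELS}
    for distance in distances_miles:
        if distance < 0:
            raise ValueError("distances must be non-negative")
        # bisect_right sends a distance equal to an edge to the higher bucket
        index = bisect_right(BUCKET_EDGES_MILES, distance)
        counts[BUCKET_LABELS[index]] += 1
    return counts


print(commute_histogram([0.0, 9.9, 10.0, 24.9, 30.0, 75.0]))
```

Comparing the histogram of the original distances with that of 
the sanitized ones gives the kind of accuracy comparison 
discussed in this section. 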

Sanitizing Algorithm for a Dataset of Locations 
Input:  D : hCoord × wCoord          // dataset to sanitize 
        ℓ, r, u, v, δ_r, δ_θ, A      // same as in Figure 5 
Output: Sanitized version D′ of input D 

D′ = ∅;                              // initializing output dataset 
ε = ℓ/r; 
for each (c_h, c_w) ∈ D do 
    c′_h = NoisyPt(c_h, ε, u, v, δ_θ, δ_r, A);   // sanitized point 
    D′ = D′ ∪ {(c′_h, c_w)};         // adding sanitized point 
end-for 
return D′; 

Figure 13. Our sanitizing algorithm, based on data perturbation 
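
The loop of Figure 13 can be sketched in Python as follows. This 
is a simplified illustration, not the algorithm itself: it 
replaces NoisyPt with continuous planar Laplacian noise (uniform 
angle, radius assumed Gamma-distributed with shape 2 and scale 
1/ε), omits the discretization and truncation controlled by u, 
v, δ_r, δ_θ and A, and assumes planar coordinates expressed in 
the same unit as r. All names are hypothetical. 

```python
import math
import random


def planar_laplace_noise(epsilon, rng):
    """Draw a 2D noise vector with uniform angle and Gamma(2, 1/epsilon) radius."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.gammavariate(2.0, 1.0 / epsilon)  # shape 2, scale 1/epsilon
    return radius * math.cos(theta), radius * math.sin(theta)


def sanitize_dataset(dataset, level, r, rng=None):
    """Perturb the home coordinate of every (home, work) pair; work is kept as is.

    A list is returned instead of a set so that repeated pairs are preserved.
    """
    if level <= 0 or r <= 0:
        raise ValueError("level and r must be positive")
    rng = rng or random.Random()
    epsilon = level / r
    sanitized = []
    for (home_x, home_y), work in dataset:
        dx, dy = planar_laplace_noise(epsilon, rng)
        sanitized.append(((home_x + dx, home_y + dy), work))
    return sanitized


# Toy dataset of (hCoord, wCoord) pairs in miles; the values are made up.
toy = [((0.0, 0.0), (3.0, 4.0)), ((1.0, 2.0), (10.0, 0.0))]
for home, work in sanitize_dataset(toy, level=math.log(2), r=1.22, rng=random.Random(1)):
    print(home, work)
```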




Figure 14. Home-work commute distance for various levels ℓ. 
Panel (a): r = 1.22. 

We have chosen the San Francisco (SF) County as the 
residence area for our experimental analysis. Additionally, we 
restrict the work location of individuals residing in the 
San Francisco County to the state of California. The total 
number of individuals satisfying these conditions amounts to 
374,390. All experiments have been carried out using version 
6.0 of the LODES dataset. In addition, the mapping from 
census blocks to their corresponding centroids has been done 
using the 2011 TIGER census block shapefile information 
provided by the Census Bureau. 

We now proceed to compare the LODES dataset, seen as 
a histogram, with several sanitized versions of it generated 
by our algorithm. Figure 14(a) depicts how the geographical 
information degrades when fixing r to 1.22 miles (so as to 
ensure geo-indistinguishability within 10% of the land area 
of the SF County) and varying ℓ. The precision parameters 
were chosen as follows: u = 10~^ miles, ^'s diameter was 
set to 10^ miles, and the standard double precision values 
for δ_r and δ_θ (for the corresponding ranges). 

We have also conducted experiments varying r and 
fixing ℓ. For instance, if we want to provide 
geo-indistinguishability for 5%, 10%, and 25% of the land area 
of the SF County (approx. 46.87 mi²), we can set r = 0.86, 
1.22, and 1.93 miles, respectively. Then by taking ℓ = ln(2) 
we get a histogram very similar to the previous one. This 
is not surprising as the noise generated by our algorithm 
depends only on the ratios ℓ/r, which are similar for the 
values above. 
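
The radii quoted above are consistent with choosing r so that a 
disc of radius r covers the desired fraction of the land area. 
The following Python sketch makes that assumption explicit; the 
constant and function names are ours. 

```python
import math

SF_LAND_AREA_SQ_MI = 46.87  # approximate land area quoted in the text


def radius_for_area_fraction(fraction: float, total_area: float) -> float:
    """Radius of a disc whose area equals fraction * total_area."""
    if not 0.0 < fraction <= 1.0 or total_area <= 0:
        raise ValueError("need 0 < fraction <= 1 and total_area > 0")
    return math.sqrt(fraction * total_area / math.pi)


LEVEL = math.log(2)

for fraction in (0.05, 0.10, 0.25, 1.0):
    r = radius_for_area_fraction(fraction, SF_LAND_AREA_SQ_MI)
    print(f"{fraction:.0%} of the area: r = {r:.2f} mi, level/r = {LEVEL / r:.3f}")
```

The last fraction (the whole county) gives the radius used when 
comparing with the guarantees of the Census Bureau's algorithm. 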

As shown in Figure 14(a), our algorithm has little effect 
on the bucket counts corresponding to mid/long distance 
commutes: over twenty-five miles the counts of the sanitized 
dataset are almost identical to those of the input dataset, 
even for the higher degrees of privacy. For short commutes, 
on the other hand, the increase in privacy degrades the 
accuracy of the sanitized dataset: several of the commutes 
that fall in the 0-to-10-miles bucket in the original data fall 
instead in the 10-to-25-miles bucket in the sanitized data. 

After analyzing the accuracy of the sanitized datasets 
produced by our algorithm for several levels of privacy, 
we proceed to compare our approach with the one followed 
by the Census Bureau to sanitize the LODES dataset. Such 
comparison is unfortunately not straightforward; on the one 
hand, the approaches provide different privacy guarantees 



(see discussion below) and, on the other hand, the Census 
Bureau is not able to provide us with a (sanitized) dataset 
sample produced by their algorithm (which would allow us 
to compare both approaches in terms of accuracy) as this 
might compromise the protection of the real data. 

The algorithm used by the Census Bureau satisfies a 
notion of privacy called (ε, δ)-probabilistic differential 
privacy, which is a relaxation of standard differential 
privacy that provides ε-differential privacy with probability 
at least 1 − δ [12]. In particular, their algorithm satisfies 
(8.6, 0.00001)-probabilistic differential privacy. This level 
of privacy could be compared to geo-indistinguishability 
for ℓ = 8.6 and r = 3.86, which corresponds to providing 
protection in an area of the size of the SF County. Figure 
14(b) presents the results of our algorithm for such level of 
privacy and also for higher levels. 

It becomes clear that, by allowing high values for ℓ 
(ℓ = 8.6 = ln(5432), ℓ = 4.3 = ln(74), and ℓ = 2.15 = 
ln(9)) it is possible to provide privacy in large areas without 
significantly diminishing the quality of the sanitized dataset. 

VII. Related Work 

Much of the related work has been already discussed in 
Section II; here we only mention the works that were not 
reported there. We refer to [25] for an excellent survey on 
privacy methods for geolocation. 

LISA [26] provides location privacy by preventing an 
attacker from relating any particular point of interest (POI) 
to the user's location. That way, the attacker cannot infer 
which POI the user will visit next. The privacy metric used 
in this work is m-unobservability. The method achieves 
m-unobservability if, with high probability, the attacker cannot 
relate the estimated location to at least m different POIs in 
the proximity. 

SpaceTwist [27] reports a fake location (called the 
"anchor") and queries the geolocation system server 
incrementally for the nearest neighbors of this fake location 
until the k-nearest neighbors of the real location are obtained. 

VIII. Conclusion and future work 

In this paper we have presented a framework for achieving 
privacy in location-based applications, taking into account 
the desired level of protection as well as the side-information 
that the attacker might have. The core of our proposal is a 
new notion of privacy, which we call geo-indistinguishability, 
and a method, based on a bivariate version of the Laplace 
function, to perturb the actual location. We have put 
a strong emphasis on the formal treatment of the privacy 
guarantees, both in giving a rigorous definition of 
geo-indistinguishability, and in providing a mathematical proof 
that our method satisfies this property. We have also shown 
how geo-indistinguishability relates to the popular notion of 
differential privacy. Finally, we have illustrated the 
applicability of our method with two case studies: interaction 
with a POI-retrieval service, and sanitization of the LODES 
dataset. 

In the future we aim at extending our method to cope with 
more complex applications, possibly involving the sanitiza- 
tion of several (potentially related) locations. One important 
aspect to consider when generating noise on several data 
is the fact that their correlation may degrade the level of 
protection. We aim at devising techniques to control the 
possible loss of privacy and to allow the composability of 
our method. 

In a recent paper [28] it has been shown that, due to finite 
precision and rounding effects of floating-point operations, 
the standard implementations of the Laplacian mechanism 
result in an irregular distribution which causes the loss 
of the property of differential privacy. The same paper 
proposes a solution based on a post-processing snapping 
procedure. In our setting, we suspect that we encounter the 
same kind of problem when we draw the radius according 
to the bivariate Laplacian. We believe that the solution 
proposed in [28] applies also to our case, and that the 
snapping procedure will cause an effect equivalent to a loss 
of precision. This means that, even in a double-precision 
machine, the remapping from polar to cartesian coordinates 
may require a non-negligible additional amount of noise in 
order to preserve differential privacy, i.e. the gap between 
e and e' (cf. Figure [t]) may become larger. We plan to 
investigate this relation more accurately in the future. 
Appendix 

In this appendix we provide the technical details that have 
been omitted from the main body of the paper. 



A. Results from Section III 



Theorem 3.1: Geo-indistinguishability-I, II, III coincide. 
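Proof: The equivalence of Geo-indistinguishability-I 
and III can be shown by applying Bayes' law. We show here 
the equivalence between Geo-indistinguishability-II and III. 

Assume that $K$ satisfies geo-indistinguishability-III. We 
first show that for all $r > 0$: 

$$
\begin{aligned}
P(S \mid B_r(x)) &= \sum_{x' \in B_r(x)} P(S, x' \mid B_r(x)) \\
&= \sum_{x' \in B_r(x)} P_X(x' \mid B_r(x))\, K(x')(S) \\
&\ge \sum_{x' \in B_r(x)} P_X(x' \mid B_r(x))\, e^{-\epsilon r} K(x)(S) && (d(x, x') \le r) \\
&= e^{-\epsilon r} K(x)(S)
\end{aligned}
$$

Then 

$$
P(x \mid S, B_r(x)) = \frac{P(S \mid x)}{P(S \mid B_r(x))}\, P(x \mid B_r(x)) \le e^{\epsilon r} P(x \mid B_r(x))
$$

For the opposite direction, let $x_1, x_2 \in \mathcal{X}$, let 
$r = d(x_1, x_2)$ and define a prior distribution $P_X(x)$ as: 

$$
P_X(x) = \begin{cases} t & x = x_1 \\ 1 - t & x = x_2 \\ 0 & \text{otherwise} \end{cases}
$$

Using that prior for $t \in (0, 1)$ we have for all $S$: 

$$
\begin{aligned}
K(x_1)(S) &= P(S \mid x_1) \\
&= P(S \mid x_1, B_r(x_1)) \\
&= \frac{P(x_1 \mid S, B_r(x_1))}{P(x_1 \mid B_r(x_1))}\, P(S \mid B_r(x_1)) \\
&\le e^{\epsilon r} P(S \mid B_r(x_1)) \\
&\le e^{\epsilon r} \sum_{x \in B_r(x_1)} P(S, x \mid B_r(x_1)) \\
&\le e^{\epsilon r} \left( t\, P(S \mid x_1) + (1 - t)\, P(S \mid x_2) \right) \\
&\le e^{\epsilon r} \left( t\, K(x_1)(S) + (1 - t)\, K(x_2)(S) \right)
\end{aligned}
$$

Note that we need $t \in (0, 1)$ so that $P_X(x_1), P_X(x_2)$ are 
positive and the conditional probabilities can be defined. 
Finally, taking the limit for $t \to 0$ on both sides of the above 
inequality we get $K(x_1)(S) \le e^{\epsilon r} K(x_2)(S)$. ■ 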
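Theorem 3.2: If $K_0$ satisfies $\epsilon$-geo-indistinguishability, 
then $K$ satisfies $n\epsilon$-geo-indistinguishability. 

Proof: Let $\mathbf{x} = (x_1, \ldots, x_n)$, $\mathbf{x}' = (x'_1, \ldots, x'_n)$ such 
that $d_\infty(\mathbf{x}, \mathbf{x}') \le r$. This implies that $d(x_i, x'_i) \le r$, 
$1 \le i \le n$. We have: 

$$
\begin{aligned}
P(\mathbf{z} \mid \mathbf{x}) &= \prod_i P(z_i \mid x_i) \\
&\le \prod_i e^{\epsilon r} P(z_i \mid x'_i) \\
&= e^{n \epsilon r} P(\mathbf{z} \mid \mathbf{x}')
\end{aligned}
$$
■ 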




Figure 15. Bounding the probability of x in the discrete Laplacian. 



B. The planar laplacian satisfies geo-indistinguishability 

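Given the definition of $D_\epsilon(x_0)(x)$ in (1), by the triangle 
inequality we have 

$$
D_\epsilon(x_0)(x) \le e^{\epsilon d(x_0, x'_0)} D_\epsilon(x'_0)(x)
$$

Using well-known properties of integrals, we derive 

$$
\int_S D_\epsilon(x_0)(x)\, ds \le \int_S e^{\epsilon d(x_0, x'_0)} D_\epsilon(x'_0)(x)\, ds
$$

and 

$$
\int_S D_\epsilon(x_0)(x)\, ds \le e^{\epsilon d(x_0, x'_0)} \int_S D_\epsilon(x'_0)(x)\, ds
$$

Now, taking into account the definition of $K$: 

$$
K(x_0)(S) = \int_S D_\epsilon(x_0)(x)\, ds
$$

we derive 

$$
K(x_0)(S) \le e^{\epsilon d(x_0, x'_0)} K(x'_0)(S)
$$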
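
The pointwise inequality between densities can also be checked 
numerically. The Python sketch below assumes that the planar 
Laplacian centred at x0 has density ε²/(2π) · e^(−ε d(x0, x)) and 
samples random triples of points; it is a sanity check under 
that assumption, not a replacement for the proof, and all names 
are illustrative. 

```python
import math
import random


def planar_laplace_pdf(x0, x, epsilon):
    """Assumed density of the planar Laplacian centred at x0, evaluated at x."""
    return epsilon ** 2 / (2.0 * math.pi) * math.exp(-epsilon * math.dist(x0, x))


def max_ratio_to_bound(epsilon, trials=2000, seed=0):
    """Largest observed D(x0)(x) / (e^(eps * d(x0, x0')) * D(x0')(x)).

    By the triangle inequality this ratio should never exceed 1.
    """
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x0, x0_prime, x = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(3)]
        bound = math.exp(epsilon * math.dist(x0, x0_prime)) * planar_laplace_pdf(x0_prime, x, epsilon)
        worst = max(worst, planar_laplace_pdf(x0, x, epsilon) / bound)
    return worst


EPSILON = math.log(4) / 0.2  # value of the running example
print(max_ratio_to_bound(EPSILON))
```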



C. The discretization preserves geo-indistinguishability 
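Theorem 4.1: Assume $r_{max} < u/\delta_\theta$, and let 
$q = u/(r_{max}\,\delta_\theta)$. Let $\epsilon, \epsilon' \in \mathbb{R}^+$ such that 

$$
\epsilon' + \frac{1}{u} \ln \frac{q + 2e^{\epsilon' u}}{q - 2e^{\epsilon' u}} \le \epsilon
$$

Then $K_{\epsilon'}$ provides $\epsilon$-geo-indistinguishability within the 
range of $r_{max}$. Namely, if $d(x_0, x), d(x'_0, x) \le r_{max}$ then: 

$$
K_{\epsilon'}(x_0)(x) \le e^{\epsilon d(x_0, x'_0)} K_{\epsilon'}(x'_0)(x).
$$

Proof: The case in which $x_0 = x'_0$ is trivial. We 
consider therefore only the case in which $x_0 \ne x'_0$. Note that in 
this case $d(x_0, x'_0) \ge u$. We proceed by determining an 
upper bound on $K_{\epsilon'}(x_0)(x)$ and a lower bound on $K_{\epsilon'}(x'_0)(x)$ 
for generic $x_0$, $x'_0$ and $x$ such that $d(x_0, x), d(x'_0, x) \le r_{max}$. 
Let $S$ be the set of points for which $x$ is the closest point 
in $\mathcal{G}$, namely: 

$$
S = R(x) = \{ y \in \mathbb{R}^2 \mid \forall x' \in \mathcal{G}.\ d(y, x) \le d(y, x') \}
$$

Ideally, the points remapped in $x$ would be exactly those 
in $S$. However, due to the finite precision of the machine, 
the points actually remapped in $x$ are those of $R_W(x)$ (see 
Section IV-B). Hence the probability of $x$ is that of $S$ plus 
or minus the small rectangles¹ $W$ of size $\delta_r \times r\,\delta_\theta$ at the 
border of $S$, where $r = d(x_0, x)$, see Figure 15. Let us 
denote by $S_W$ the total area of these small rectangles $W$ on 
one of the sides of $S$. Since $d(x_0, x) \le r_{max} < u/\delta_\theta$, and 
$\delta_r < r_{max}\,\delta_\theta$, we have that $S_W$ is less than $1/q$ of the area of 
$S$, where $q = u/(r_{max}\,\delta_\theta)$. The probability density on this area 
differs at most by a factor $e^{\epsilon' u}$ from that of the other points 
in $S$. Finally, note that on two sides of $S$ the rectangles $W$ 
contribute positively to $K_{\epsilon'}(x_0)(x)$, while on two sides they 
contribute negatively. Summarizing, we have: 

$$
K_{\epsilon'}(x_0)(x) \le \left(1 + \frac{2e^{\epsilon' u}}{q}\right) \int_S D_{\epsilon'}(x_0)(x_1)\, ds \qquad (3)
$$

and 

$$
\left(1 - \frac{2e^{\epsilon' u}}{q}\right) \int_S D_{\epsilon'}(x'_0)(x_1)\, ds \le K_{\epsilon'}(x'_0)(x) \qquad (4)
$$

Observe now that 

$$
\frac{D_{\epsilon'}(x_0)(x_1)}{D_{\epsilon'}(x'_0)(x_1)} = e^{-\epsilon' (d(x_0, x_1) - d(x'_0, x_1))}
$$

By the triangle inequality we obtain 

$$
D_{\epsilon'}(x_0)(x_1) \le e^{\epsilon' d(x_0, x'_0)} D_{\epsilon'}(x'_0)(x_1)
$$

from which we derive 

$$
\int_S D_{\epsilon'}(x_0)(x_1)\, ds \le e^{\epsilon' d(x_0, x'_0)} \int_S D_{\epsilon'}(x'_0)(x_1)\, ds \qquad (5)
$$

from which, using (3), (5), and (4), we obtain 

$$
K_{\epsilon'}(x_0)(x) \le e^{\epsilon' d(x_0, x'_0)}\, \frac{q + 2e^{\epsilon' u}}{q - 2e^{\epsilon' u}}\, K_{\epsilon'}(x'_0)(x) \qquad (6)
$$

Assume now that 

$$
\epsilon' + \frac{1}{u} \ln \frac{q + 2e^{\epsilon' u}}{q - 2e^{\epsilon' u}} \le \epsilon
$$

Since we are assuming $d(x_0, x'_0) \ge u$, we derive: 

$$
e^{\epsilon' d(x_0, x'_0)}\, \frac{q + 2e^{\epsilon' u}}{q - 2e^{\epsilon' u}} \le e^{\epsilon d(x_0, x'_0)} \qquad (7)
$$

Finally, from (6) and (7), we conclude. 

¹ $W$ is actually a fragment of a circular crown, but since $\delta_\theta$ is very small, 
it approximates a rectangle. Also, the side of $W$ is not exactly $r\,\delta_\theta$; it is a 
number in the interval $[(r - u/\sqrt{2})\,\delta_\theta, (r + u/\sqrt{2})\,\delta_\theta]$. However, $u\,\delta_\theta$ is 
very small with respect to the other quantities involved, hence we consider 
this difference negligible. 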
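
To give a feeling for the size of the gap between ε′ and ε, the 
following Python sketch searches by bisection for the largest ε′ 
satisfying the premise ε′ + (1/u) ln((q + 2e^(ε′u))/(q − 2e^(ε′u))) ≤ ε, 
with q = u/(r_max δ_θ). The numeric values of u, r_max and δ_θ 
below are hypothetical placeholders chosen only for illustration, 
and the function names are ours. 

```python
import math


def premise_lhs(eps_prime, u, r_max, delta_theta):
    """Left-hand side of the premise; infinite when q <= 2 e^(eps' u)."""
    q = u / (r_max * delta_theta)
    a = 2.0 * math.exp(eps_prime * u)
    if q <= a:
        return math.inf
    # ln((q + a) / (q - a)) computed as log1p(2a / (q - a)) for accuracy when a << q
    return eps_prime + math.log1p(2.0 * a / (q - a)) / u


def largest_eps_prime(epsilon, u, r_max, delta_theta, iterations=200):
    """Largest eps' (found by bisection) whose premise value is at most epsilon."""
    if not r_max < u / delta_theta:
        raise ValueError("the premise requires r_max < u / delta_theta")
    if premise_lhs(0.0, u, r_max, delta_theta) > epsilon:
        raise ValueError("no positive eps' satisfies the premise for these values")
    lo, hi = 0.0, epsilon  # the left-hand side is increasing in eps' and exceeds eps at eps
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if premise_lhs(mid, u, r_max, delta_theta) <= epsilon:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical values: distances in miles, angle precision in radians.
epsilon = math.log(2) / 1.22
eps_prime = largest_eps_prime(epsilon, u=1e-3, r_max=100.0, delta_theta=1e-12)
print(f"epsilon = {epsilon:.6f}, eps' = {eps_prime:.6f}, gap = {epsilon - eps_prime:.2e}")
```

For these placeholder values the gap is small compared with ε; 
coarser angle precision or a larger r_max makes q smaller and 
the gap larger. 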



Figure 16. Probability of x in the truncated discrete laplacian. 



D. The truncation preserves geo-indistinguishability 

Theorem 4.2: Let $r_{max}$, $\epsilon$ and $\epsilon'$ satisfy the premise of 
Theorem 4.1. If $r_{max} \ge diam(\mathcal{A})$, then $K^T_{\epsilon'}$ provides 
$\epsilon$-geo-indistinguishability within $\mathcal{A}$. 

Proof: The proof proceeds like the one for Theorem 4.1 
except when $R(x)$ is on the border of $\mathcal{A}$. In this latter case, 
the probability on $x$ is given not only by the probability on 
$R(x)$ (plus or minus the small rectangles $W$, see the proof 
of Theorem 4.1), but also by the probability of the part $C$ of 
the cone determined by $o$, $R(x)$, and lying outside $\mathcal{A}$ (see 
Figure 16). Following a similar reasoning as in the proof of 
Theorem 4.1 we get 

$$
K^T_{\epsilon'}(x_0)(x) \le \left(1 + \frac{2e^{\epsilon' u}}{q}\right) \int_{S \cup C} D_{\epsilon'}(x_0)(x_1)\, ds
$$

and 

$$
\left(1 - \frac{2e^{\epsilon' u}}{q}\right) \int_{S \cup C} D_{\epsilon'}(x'_0)(x_1)\, ds \le K^T_{\epsilon'}(x'_0)(x)
$$

The rest follows as in the proof of Theorem 4.1. 



